# Syntactic architecture and its consequences I

Syntax inside the grammar

Edited by András Bárány Theresa Biberauer Jamie Douglas Sten Vikner

### Open Generative Syntax

Editors: Elena Anagnostopoulou, Mark Baker, Roberta D'Alessandro, David Pesetsky, Susi Wurmbrand


Bárány, András, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.). 2020. *Syntactic architecture and its consequences I*: *Syntax inside the grammar* (Open Generative Syntax 9). Berlin: Language Science Press.

This title can be downloaded at: http://langsci-press.org/catalog/book/275 © 2020, the authors Published under the Creative Commons Attribution 4.0 Licence (CC BY 4.0): http://creativecommons.org/licenses/by/4.0/ ISBN: 978-3-96110-275-4 (Digital) 978-3-96110-276-1 (Hardcover)

ISSN: 2568-7336 DOI: 10.5281/zenodo.4041229 Source code available from www.github.com/langsci/275 Collaborative reading: paperhive.org/documents/remote?type=langsci&id=275

Cover and concept of design: Ulrike Harbort

Typesetting: András Bárány, Felix Kopecky, Jamie Douglas Proofreading: Ahmet Bilal Özdemir, Amir Ghorbanpour, Amy Lam, Ana Afonso, Andreas Hölzl, Andriana Koumbarou, Brett Reynolds, Carla Bombi Ferrer, Christopher Straughn, Conor Pyle, Daniel Wilson, George Walkden, Geoffrey Sampson, Jackie Lai, Jakub Sláma, Jeroen van de Weijer, Kate Bellamy, Ludger Paschen, Jean Nitzke, Radek Šimík, Sean Stalley, Teodora Mihoc, Timm Lichte, Tom Bossuyt Fonts: Libertinus, Arimo, DejaVu Sans Mono Typesetting software: XƎLATEX

Language Science Press xHain Grünberger Str. 16 10243 Berlin, Germany langsci-press.org

Storage and cataloguing done by FU Berlin

You say you want a revolution Well you know We all want to change the world You tell me that it's evolution Well you know We all want to change the world

Don't you know it's gonna be alright

— The Beatles, *Revolution 1*

# **Contents**




# **Introduction**

András Bárány Leiden University

Theresa Biberauer University of Cambridge, Stellenbosch University, University of the Western Cape

Jamie Douglas University of Cambridge

Sten Vikner Aarhus University

The three volumes of *Syntactic architecture and its consequences* present contributions to comparative generative linguistics that "rethink" existing approaches to an extensive range of phenomena, domains, and architectural questions in linguistic theory. At the heart of the contributions is the tension between descriptive and explanatory adequacy which has long animated generative linguistics and which continues to grow thanks to the increasing amount and diversity of data available to us. As the three volumes show, such data from a large number of understudied languages as well as diatopic and diachronic varieties of well-known languages are being used to test previously stated hypotheses, develop novel ideas, and expand our understanding of linguistic theory.

The volumes feature a combination of squib- and regular-length discussions addressing research questions with foci which range from micro to macro in scale. We hope that together, they provide a valuable overview of issues that are currently being addressed in generative linguistics, broadly defined, allowing readers to make novel analogies and connections across a range of different research strands. The chapters in Volume 2, *Between syntax and morphology*, and Volume 3, *Inside syntax*, develop novel insights into phenomena such as syntactic categories, relative clauses, constituent orders, demonstrative systems, alignment types, case, agreement, and the syntax of null elements.

András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner. 2020. Introduction. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, v–viii. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972826

The contributions to the present, first volume, *Syntax inside the grammar*, address research questions on the relation of syntax to other aspects of grammar and linguistics more generally. The volume is divided into two parts, dealing with language acquisition, variation and change (Part I), and syntactic interfaces (Part II).

The chapters in Part I, *Language acquisition, variation and change*, address questions such as the role of random drift in language change (Clark), complexity in grammars (Bejar, Massam, Pérez-Leroux, and Roberge), and the modelling of syntactic micro- and macro-variation across languages synchronically in Bantu and Polynesian languages (van der Wal; Travis), diachronically (Schifano and Cognola), and also across frameworks (Borsley; Vincent and Börjars). The chapters by Haeberli and Ihsane, Fuß and Trips, and van Kemenade provide novel insights into the diachrony of English verbs, subjects, and prepositions, respectively, while Vincent and Börjars' contribution shines light on the general notion of "heads" across time and across current syntactic frameworks, and Roussou focuses on the diachrony and grammaticalisation of complementisers.

Several chapters in Part II, *Syntactic interfaces*, explore how syntax and semantics interact in the context of decomposed functional structure, expanding on influential proposals on fine-grained distinctions in the *v*-domain (Chomsky 1995; Kratzer 1996) and the structuring of events (Borer 2003; 2005a,b; 2013; Ramchand 2008; 2018). Specific cases discussed here are the decomposition of passives (Biggs; Fadlon, Horvath, Siloni, and Wexler), telicity (Hu), split intransitivity (J. Baker), and verb-internal modifiers (Song). Questions about higher levels of clausal architecture, such as the lack of verbal wh-expressions (Irurtzun) and potential violations of the Final-over-Final Condition (Aboh) also feature in this part. Other chapters, in turn, tackle issues in the nominal domain, such as the syntax of nominal predication (Adger), a novel perspective on Binding Principles A and B (Richards), and questions on the syntax of classifiers and classifier languages (Lam). Finally, the syntax–phonology interface in several Bantu languages is the topic of Hyman's chapter.

Taken together, then, the contributions to this volume, many of which have clearly been influenced and inspired by Roberts (2010; 2012; 2014; 2019), Roberts & Roussou (2003), Roberts & Holmberg (2010), Biberauer & Roberts (2012; 2015), and Biberauer et al. (2014), give the reader a sense of the lively nature of current discussion of topics in synchronic and diachronic comparative syntax, ranging from the core verbal domain to higher, propositional domains.


## **References**




# **Part I**

# **Language acquisition, variation and change**

# **Chapter 1**

# **Drift, finite populations, and language change**

# Robin Clark

University of Pennsylvania

History happens only once. This seems to set up an impenetrable barrier for social sciences, like historical linguistics, that concern themselves with change over time. We have the historical record to go on with no convincing way to generate alternative histories that could be used for hypothesis testing. Nevertheless, it is of some interest to ask whether what we see in the historical record is due to particular forces or whether the time series we see could be the result of random drift. In this paper, I will spell out some simple principles of random drift that can be used to construct null hypotheses against which we can study particular cases of language change. The study of random drift allows us to sharpen our analyses of language change and develop more constrained theories of language variation and change.

# **1 Introduction**

More years ago than I like to count, Ian Roberts and I wondered about the causal mechanisms of language change (Clark & Roberts 1993). At the time, the idea was that language change would happen when the learner cannot uniquely determine the grammar on the basis of linguistic evidence; in these circumstances, the learner would be inexorably driven toward the simpler analysis and the language would change. I can confess here that my own thinking about how this could happen was rather thin; I supposed that language contact, whether between different language groups or different sociolinguistic levels, would introduce ambiguities into the learner's evidence, thus driving change.

While there is no doubt that language contact is an important driver of language change, we should ask whether it is the *sole* driver of change. Suppose we conclude that language contact is the sole driver of change; what are we to say about language diversity? Where does the diversity of languages come from, if not from multigenesis? Imagine, though, that a homogeneous linguistic group is isolated for a millennium; would the language really remain unchanged over that time, simply because the group had no contact with any other group?

Robin Clark. 2020. Drift, finite populations, and language change. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 3–14. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972828

Clearly, it is worth our while to investigate other potential sources for language change beyond the clear case of language contact. I will make the case, here, that random processes (drift) could be a source of language change. More precisely, the sampling error that arises from each individual's particular experience with language could be a source of language variation, particularly when amplified through a hierarchical social structure that includes language leaders, individuals who are taken as models by other members of their social group. In fact, as we shall see, this sort of variation is inevitable in finite populations, a fact that has long been known in population genetics (Crow & Kimura 1970).

## **2 Random processes and neutral models**

The Hardy–Weinberg model, an early model of gene frequencies in populations, had a simple structure that made it an appealing and simple model of change over time; the equation underlying the model is exceedingly friendly and has not only been used in biology but has also been usefully adapted to build mathematical models of social and cultural evolution (Boyd & Richerson 1985), since it can neatly express the relationship between two variant forms, *p* and *q*.<sup>1</sup> From a linguistic perspective, we could take *p* and *q* to be the probabilities of two linguistic variants that cannot be expressed simultaneously and are, thus, in competition with each other; for example, *p* might be the likelihood of verb raising, while *q* is the probability of leaving the verb in situ. In this case, of course, we would take *q* to be 1 − *p*.

Crucially, the model makes a number of assumptions about populations. First, there is random pairing; individuals do not "clump" together into groups depending on their preference for one variant or the other. Second, it is assumed that selection is not operating on the population; in other words, one variant is not preferentially replicated. Third, mutation and migration are absent; new variants are not introduced that might compete with the existing options and there is no outflow or inflow of new variants. Thus, the model in its simplest form would put aside both innovation (mutation) and language contact (migration) as sources of change. Finally, and this point is crucial, the population is infinite in size so that frequencies of the variants are not subject to chance fluctuations.<sup>2</sup> These assumptions implied that, all else being equal, the population would quickly achieve a *mixed equilibrium state*. This means that, in the absence of selection, the population frequencies for the character in question would remain stable. The frequencies in a population, if disturbed, will quickly return to equilibrium. If, however, some force acts on the underlying frequencies, then the population will happily rest at the new frequency. One such force would be *selection*, where one variant is, for whatever reason, preferred over the other. An observed positive change in the frequency of a character would then imply either that positive selection was working on that characteristic or that negative selection was acting on the other variant of that characteristic.

<sup>1</sup> See also Cavalli-Sforza & Feldman (1981), one of the earliest attempts to propose a population-based model of cultural evolution; McElreath & Boyd (2007) is a good overview of mathematical models of social evolution, in particular their Chapter 1.

To make the discussion concrete, suppose that the variants are (1) inversion of the subject and the main verb in questions or (2) insertion of an auxiliary verb which is then inverted with the subject. Suppose further that the frequency of the second variant is increasing. A Hardy–Weinberg model would treat this as either selection for the second variant or selection against the first variant. Otherwise, in the absence of selection, the relative frequencies of the two types should remain constant.
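The contrast between equilibrium and selection in an infinite population can be sketched numerically. The following is a minimal haploid, replicator-style caricature of that logic, not the diploid Hardy–Weinberg equation itself; the function name and the fitness values `w_p`, `w_q` are illustrative choices of mine, not anything from the chapter.

```python
def next_freq(p, w_p=1.0, w_q=1.0):
    """One generation of variant-frequency change in an infinite population.

    p is the frequency of one variant, w_p and w_q the relative fitnesses
    of the two competing variants (the other variant has frequency 1 - p).
    With w_p == w_q there is no selection and p stays put: the mixed
    equilibrium state. Unequal fitnesses model selection.
    """
    mean_w = p * w_p + (1 - p) * w_q   # population mean fitness
    return p * w_p / mean_w

# No selection: the frequency rests wherever it starts.
p = 0.3
for _ in range(100):
    p = next_freq(p)
print(round(p, 12))   # 0.3

# A small fitness edge for the p-variant steadily drives its frequency up.
p = 0.3
for _ in range(100):
    p = next_freq(p, w_p=1.05)
print(p > 0.9)   # True
```

On this toy picture, an observed rise in the frequency of, say, the auxiliary-insertion variant would have to be attributed to a fitness asymmetry, since without one the frequencies never move.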

Although the model is appealingly simple, population biologists soon questioned the assumption that the population is infinite. Clearly, infinite populations don't exist in nature, so it's of some interest to consider what happens in a finite population. So let's suppose that we have a finite population of individuals. Since we are interested in the spread of properties in a population, we can safely suppose that some features are replicated by a copying process. Since the population is finite, we can further suppose that some copies are removed from the population. More precisely, at each time step, one individual is randomly selected from the population according to a uniform distribution and copied, and one individual is randomly selected and deleted. This is a *Moran process* (Moran 1958), and it is a simple model of how random forces due to sampling can act on a population. This process should have some resonance in linguistics, since variants might be randomly sampled in the population; by chance, I may have heard the past tense of *sneak* as *snuck* rather than *sneaked* and might, therefore, develop a preference for *snuck*. In general, because the process is sampling a finite population, chance becomes an important force, so that large changes in population frequencies could be due to random factors. Notice that these random changes can build up over time, resulting in a change going to completion; no other forces need to be acting on the population. Thus, a population will change over time in the absence of actual selection; see the discussion of the neutral model, below.

<sup>2</sup> The literature on the Hardy–Weinberg model is extensive. Bergstrom & Dugatkin (2012) contains a highly accessible introduction to the mathematics.
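The sampling scheme just described can be written down in a few lines. This is an illustrative sketch of a neutral Moran process, not the simulation code actually used for the figures below; the function name and parameters are my own.

```python
import random

def moran(n=10, steps=1000, seed=0):
    """Track the count of a neutral variant "A" under a Moran process.

    At each step one individual is chosen uniformly at random and copied,
    then one individual is chosen uniformly at random and removed, so the
    population size stays fixed at n. No selection operates: any trend in
    the counts is pure sampling error (drift).
    """
    rng = random.Random(seed)
    pop = ["A"] * (n // 2) + ["B"] * (n - n // 2)
    counts = [pop.count("A")]
    for _ in range(steps):
        pop.append(rng.choice(pop))        # birth: uniform random copy
        pop.pop(rng.randrange(len(pop)))   # death: uniform random removal
        counts.append(pop.count("A"))
        if counts[-1] in (0, n):           # absorbing states: loss or fixation
            break
    return counts
```

Running this repeatedly for populations of 10, 100, and 1,000 individuals reproduces the qualitative pattern discussed below: small populations fixate or lose the variant quickly, while larger ones drift more slowly but still, given enough time, go to completion.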

Figures 1.1–1.3 show the results of three different experiments with this random process in populations of various sizes, the process repeated fifteen times for each population size; in all these cases, we are simply applying the random sampling process described above to the population. In each figure, the *x*-axis shows the number of steps and the *y*-axis shows the number of individuals bearing some variable trait, call it "A". My interest here is simply to show the potential effects of population size, so what we will do is consider how this random process plays out on populations of different sizes, ranging from 10 individuals up through orders of magnitude. We will briefly turn to the applicability to language below.

In Figure 1.1, the population consists of 10 individuals. We begin with half the population having the trait A and the other half lacking it; the figure tracks the frequency of the trait in the population over time. By hypothesis, the trait itself has no consequences for either survival or reproduction. It is clear from Figure 1.1 that in a small population, whatever the trait is, it quickly either takes over the entire population or is removed from it. Since the trait has no consequences for survival or reproduction, the end result, whether it is fixation or elimination, is entirely up to chance, a function of random sampling. Because the population is so small, a great deal of variation emerges in short order. Small populations tend to have higher variance and will more quickly show the effects of random drift; in this case, a sample size of one has consequences for 10% of the population, so it is no wonder that the variance is so high. Note that the population quickly fixates, either with the entire population having A or with A being driven out of the population. This implies that in small populations it will be very difficult to distinguish selection for a trait from simple drift.

In Figure 1.2, the population is an order of magnitude larger than in the first experiment, with a population size of 100 individuals as opposed to 10. Again, we begin with half of the population having the trait A. The figure tracks the frequency of A over time. We can see that variance increases over time, although the increase is slower than in the small population of 10 individuals. Despite the fact that the change in variance is slower than in the smaller population, it is still considerable after only 1,000 steps; in some repetitions the trait is present in about 90% of the population, while in others it is present in only about 20% of the population. We can be sure that the population will eventually go to *completion*; variation will disappear either when the entire population has A or when A vanishes from the population; no middle course is possible (Sigmund 1993).

In Figure 1.3, I have increased the size of the population by yet another order of magnitude, to 1,000 individuals, and followed the process for 15 repetitions of 1,000 steps. With this population, the variance grows even more slowly relative to the population size. Nevertheless, it is clear that the variance does grow, as can be seen by comparing the spread of the population from step 200 to step 1,000.


Indeed, as in the above cases, the population will eventually go to completion, although it may take a much longer time to do so. It is as though increases in population size have the effect of increasingly stretching the diagram in Figure 1.1 while retaining the outcome: ultimate completion of the change after adequate time.

While I'm certainly not claiming that language change is a Moran process, these experiments illustrate a number of important features of finite populations and random sampling. First, it is clear that random forces can act powerfully on populations and, over the long term, can result in large changes in frequencies. Unlike in the simple model of infinite populations, once we have observed a change in frequency of a trait in a finite population, we must ask whether that change of frequency can be accounted for solely in terms of a random force, like sampling error, or whether we must appeal to selection if we are to understand the change. This point holds for time series of the frequencies of variant linguistic features as much as it does for time series of the frequencies of genes in a population. This fact has important consequences for the study of language change.

Second, the size of a population plays a crucial role in change; the smaller the population, the easier it is for chance to buffet the frequencies, resulting in large short-term changes in the frequency of a variant in a small population. As the size of the population grows, it becomes less likely that randomness will result in rapid changes in frequency. Thus, if we observe a rapid change in the frequency of a trait, the larger the population is, the more likely it is that the change is a result of selection rather than chance. In a small population, as Figure 1.1 illustrates, precipitous changes of frequency due to chance are not unusual.

The interpretation of population size with respect to language change is an important question. It seems to me unlikely that population, here, refers to the number of speakers of the language, although there will clearly be some relation between change and the number of speakers. If the relationship were a simple one, then we would expect languages with more speakers to change more slowly than languages with smaller numbers of speakers. I'm inclined to take population to be more intimately related to the frequencies of the forms in question. Extremely frequent forms should change only very slowly, while less frequent forms should be more inclined to drift. This accords well with the observation that irregular plural forms in English are likely to be frequent (*foot*/*feet*, *man*/*men*, *child*/*children*, *mouse*/*mice* and so on).<sup>3</sup> These reflect older stages of the language which retain vestiges of an older system but resisted change to the regular form by virtue of their "large population" (high frequency). Indeed, casting our net more widely, we see evidence that high frequency correlates with stability; highly frequent forms are more stable and retained longer while low frequency forms are less stable and are not retained as long; see, for example, Pagel et al. (2007) on rates of lexical change in Indo-European; Lieberman et al. (2007) and Newberry et al. (2017) for connections between word frequency and rates of change for irregular verbs in English.

Third, we want to be able to reliably distinguish changes that are consistent with random drift from changes that are more likely the result of selection. If we are to understand cases of language change (in particular) and social change (in general), we will want to have a method of classifying the changes we observe into those that are consistent with random drift as the sole force of change and those where we can reject random drift as the sole force. We cannot classify cases of change simply by looking at individual curves.

Consider the change shown in Figure 1.4, which was again generated by a Moran process. The curve shows a change in frequency of a trait for a population of 200 individuals. It looks sigmoidal, which we would expect if the trait was being selected for, but it was generated by the same Moran process used in the experiments shown in Figures 1.1–1.3; we know that the process did not involve selection although, by chance, this curve appears to be nicely sigmoidal. We cannot reject the hypothesis that a change is due to chance simply by looking at a curve with the naked eye. We need a reliable method that takes into account population size, rates of change, and so forth; the method should, furthermore, have wide application not only to language change but also to the quantitative study of other types of change, so that we can accumulate evidence for the fundamental scientific reliability of the method.

<sup>3</sup> See Newberry et al. (2017) for some work on the regular and irregular past tense in American English, as well as other changes including periphrastic *do* and verbal negation.

Now look at the curve in Figure 1.5. This curve was again generated by a Moran process on a population of 50 individuals. The curve ultimately trends toward the trait dominating in the population, although the frequencies vary up and down in a seemingly random fashion. It is common practice to partition the data from historical corpora into time bins. In Figure 1.6, I've broken the frequencies used in Figure 1.5 into quintiles, calculated the average in each quintile and graphed the result. The new curve shows an initial decline in the trait "A" followed by an apparently smooth monotonic increase that could be selection; the curve is clearly sigmoidal. Of course, we know that the underlying process was simply random sampling of a small population.

This brings up an important point. Random processes are always at work during evolutionary change (Kimura 1983). Thus, when we see a change in the frequency of a trait in a time series, we need to ask whether that change is due to selection or whether it is consistent with random drift. If we can rule out drift for a particular bit of language change, we can then ask why that trait was selected for (or, for that matter, against). Factors might include properties of sentence processing, learnability, or social factors (social networks, prestige, or identity). Note that we are not claiming that drift is a theory of language change by itself but, rather, that random processes are ever-present and must be controlled for in developing a theory of language change or, more broadly, of social and cultural evolution.

A theory of the random processes associated with language change would provide a *neutral model* of language, one where changes in frequency are solely due to stochastic processes. We could then compare the statistical properties of a given change with the predictions of the neutral model.<sup>4</sup> In Newberry et al. (2017), we tested drift by using techniques developed by Feder et al. (2014); the essential idea is relatively straightforward. Suppose we have a time series of frequencies of some variable trait. Starting at zero, we can keep a running sum by adding 1 if the frequency of the trait at a time step increases and subtracting 1 if the frequency goes down. We expect that the sums for drift should show a Gaussian distribution around 0. Indeed, we can estimate population size and test whether we can reject drift for various population sizes. Newberry et al. (2017) apply the technique to a number of different time series and argue that not only can we distinguish between drift and selection, but that we can quantify the strength of selection relative to population size. The method should, when applied to a broad array of different time series data, allow us to refine our theory of diachronic change.
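The running-sum idea can be sketched as follows. This is a toy illustration of the sign-counting logic described above, not Feder et al.'s actual frequency-increment test; the function names, step size, and series length are illustrative assumptions of mine.

```python
import random

def increment_sum(freqs):
    """Signed count of frequency increases (+1) and decreases (-1)."""
    s = 0
    for prev, cur in zip(freqs, freqs[1:]):
        if cur > prev:
            s += 1
        elif cur < prev:
            s -= 1
    return s

def drift_null(n_series=1000, length=50, seed=0):
    """Increment sums for many pure-drift series (symmetric random steps)."""
    rng = random.Random(seed)
    sums = []
    for _ in range(n_series):
        f, series = 0.5, []
        for _ in range(length):
            f += rng.choice([-0.01, 0.01])   # symmetric step: no selection
            series.append(f)
        sums.append(increment_sum(series))
    return sums

# Under pure drift the sums scatter symmetrically around 0; an observed
# sum far out in the tails of this null distribution is evidence that
# something other than drift is shaping the time series.
sums = drift_null()
print(sum(sums) / len(sums))   # close to 0
```

An observed time series whose increment sum sits far from the bulk of the simulated null distribution would let us reject drift as the sole force behind the change.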

So far, the reader may think that drift is a problem for the theory of language change; in fact, though, drift may also help us to understand how language variation can arise in the absence of language contact. If the population is finite, then random processes will guarantee that the variance will increase with time, as we have seen in our discussion of Moran processes around Figures 1.1–1.3. This, in turn, guarantees that new variants will constantly be brought into the population. In other words, variation can arise in the absence of language contact.

Clark & Kimbrough (2015) develop a simple mechanical model of language variation using a version of exemplar theory (Murphy 2002). The agents adapt their behavior by finding the centroid of a set of exemplars (in this case, a set of vowel pronunciations). If no other force is acting on the model, then the agents gradually find the same centroids. If, however, the model has more social structure, where some agents are designated as particularly influential, so that their productions are given extra weight by other agents, the variance grows enormously. The influential utterances, in fact, reduce the effective population size (Crow & Kimura 1970), since so many agents tend to imitate these utterances; in other words, social structure makes the population smaller, causing a large increase in the variance over time. This is, again, an example of how variation can arise spontaneously due to the statistical properties of small populations.
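The effect of influential speakers on effective population size can be sketched in a drastically simplified form. The following is my own caricature, not Clark & Kimbrough's actual exemplar model: here the whole population adopts a single weighted centroid each round, and all names, weights, and noise levels are illustrative.

```python
import random

def centroid_walk(n_agents=100, n_leaders=0, leader_weight=20,
                  rounds=500, noise=0.1, seed=0):
    """Random walk of a population norm under weighted exemplar averaging.

    Each round every agent produces the current norm plus Gaussian noise,
    and the population adopts the weighted centroid of those productions.
    Giving a few "language leaders" extra weight shrinks the effective
    population size, so the centroid wanders further over time.
    """
    rng = random.Random(seed)
    weights = [leader_weight] * n_leaders + [1] * (n_agents - n_leaders)
    total = sum(weights)
    value, path = 0.0, []
    for _ in range(rounds):
        productions = [value + rng.gauss(0, noise) for _ in range(n_agents)]
        value = sum(w * x for w, x in zip(weights, productions)) / total
        path.append(value)
    return path

def dispersion(n_leaders, trials=30):
    """Mean squared displacement of the centroid over independent runs."""
    return sum(centroid_walk(n_leaders=n_leaders, seed=s)[-1] ** 2
               for s in range(trials)) / trials
```

With these settings, `dispersion(5)` comes out several times larger than `dispersion(0)`: concentrating influence on five heavily weighted agents inflates the drift of the norm, echoing the point that social structure effectively makes the population smaller.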

I began this chapter by recalling a puzzle that Ian Roberts and I had pondered years ago. We could see that language contact could trigger language change; I couldn't quite see how languages would change in the absence of contact, but surely (I thought) that must be possible. I now offer a hypothesis about another possible origin for language variation and language change: finite populations. I hope that in future we will be able to explore this hypothesis with empirical work in corpora, modeling with agent-based models, and experimental laboratory work.

<sup>4</sup> Recently, neutral models have begun to receive a great deal of attention, long overdue. See Baxter et al. (2006; 2009); Blythe (2012); Blythe & Croft (2012); Kauhanen (2017); Stadler et al. (2016) for an array of approaches.

## **References**




# **Chapter 2**

# **Rethinking complexity**

Susana Bejar University of Toronto

Diane Massam University of Toronto

Ana-Teresa Pérez-Leroux University of Toronto

Yves Roberge University of Toronto

> This paper addresses the nature of complexity of recursion. We consider four asymmetries involving caps on recursion observed in previous experimental acquisition studies, which argue that complexity cannot be characterized exclusively in terms of the number of iterations of Merge. While recursion is essentially syntactic and allowed for by the minimalist toolkit via Merge, selection, and labeling or projection, the complexity of recursive outputs arises at the interface.

# **1 Introduction**

Watumull et al. (2014) (WHRH) discuss three criterial properties of recursion and argue that "by these necessary and sufficient criteria, the grammars of all natural languages are recursive" (p. 1). Phrases and sentences are defined recursively "in a stepwise strongly generative process creating increasing complexity" (p. 6). We focus here on this notion of complexity, since, from the perspective that recursive structures are the result of repeated applications of Merge operations, structures arising from similar derivational steps should all be derivationally equally complex. This squib sheds light on the nature of the complexity of recursion in human grammar through a theoretically-based exploration of four asymmetries observed in a series of experimental studies on the acquisition of self-embedding structures we have conducted in the last few years. Note that, while "self-embedding" often refers to complement structures, our use of the term generalizes over adjunction as well.

Susana Bejar, Diane Massam, Ana-Teresa Pérez-Leroux & Yves Roberge. 2020. Rethinking complexity. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 15–24. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972830

WHRH emphasize that recursion is an architectural property of the language faculty as opposed to a characterization of output structures, pointing to two correlates of this view: (i) recursion is an architectural universal, not an emergent property; (ii) the caps on recursion that are observable in output structures result from arbitrary external factors. Here, our work contrasts with WHRH on two points. First, we investigate recursion as a property of outputs, while what matters for WHRH is the complexity of the recursive procedure itself.<sup>1</sup> Second, we examine caps on recursion in child language as a window into the development of the language faculty. Nonetheless, we seek to explore the links between our studies and the positions articulated in WHRH. In particular, we examine the connection between children's capacity to produce self-embedding structures and the notion of complexity. We argue that while recursion is essentially syntactic and allowed for by the minimalist toolkit, via Merge, selection, and labeling or projection (cf. Hauser et al. 2002), the complexity of recursive outputs arises at the interface.<sup>2</sup>

The growth of grammatical competence gives rise to the ability to produce longer and more complex sentences. Although there is little consensus about what constitutes complexity (Culicover 2013; Roeper & Speas 2014; Trotzke & Bayer 2015; Newmeyer & Preston 2014; McWhorter 2011), most discussions agree that embedding increases complexity (Culicover & Jackendoff 2006; Givón 2009). However, in the narrow syntax, embedding by itself should not determine complexity, as it is given by recursive Merge. We argue that complexity, rather than being strictly correlated with recursive iterations of Merge, arises at the interface. Moreover, because recursive iterations of Merge can result in different varieties of recursively embedded output structures, some structural elaborations turn out to be more complex than others.

<sup>1</sup>This is not to say of course that the issue of the complexity of the recursive program is of no interest but the goal of our research is to identify the source of the difficulties that complex structures create for children (and adults).

<sup>2</sup>The view that recursion is in narrow syntax we share with WHRH and many others (e.g. Moro 2008; Nevins et al. 2009); however, it has also been proposed to be in the discourse (Evans & Levinson 2009; Koschmann 2010), or a consequence of phasal architecture and the interface (Arsenijević & Hinzen 2010).

### 2 Rethinking complexity

At the outset, the language of young children does not include structurally elaborate expressions; various forms of structural elaboration emerge during the preschool years. When a structure is absent, the property of complexity is typically attributed to that structure, but often without a clear notion of what complexity is. Here, we discuss four aspects of complexity in recursive structures that present challenges for a simple definition. We consider these issues in the context of recursive NP embedding, including conjunction, genitives, PP structures, and relative clauses. In previous work (Pérez-Leroux et al. 2012; Pérez-Leroux, Castilla-Earls, Bejar, Massam & Peterson 2018; Pérez-Leroux, Peterson, et al. 2018) we observed that recursive conjunction seems simpler than recursive PP modification, that sequential double modification is less complex than twice-embedded modification, and that the combination of relative clause and PP modification is somehow less complex than twice-embedded PP modification, at least in some of the languages studied. From this, we argue that complexity is not uniform, and that the complexity emerging from recursive embedding is a property of the interface, and not a property of narrow syntax.

We now turn to a discussion of four contrasts that shed light on the nature of complexity.

## **2 Coordination and modification**

Children learn the basic ingredients required for NP elaboration quite early, including relevant functional elements (Brown 1973) and semantic relations (Bloom et al. 1975). Pérez-Leroux et al. (2012) investigated the points when children learn to iterate forms of NP elaboration. Using a referential task, we elicited twice-embedded genitives (1a) and modificational PPs (1b). Contexts were set up so twice-embedded modification was needed to disambiguate target referents from other competing referents. For instance, we need something like (1b) to uniquely describe the target in a scenario with two girls, each with a dog, where the only difference is a hat on one of the dogs. We controlled for whether children could produce utterances with three NPs, by testing coordination, as in (2), which matched the utterance length of the recursively embedded conditions.

	- b. the girl with a dog with a hat

Of key importance is the following result: children had no difficulties producing coordinate NPs, but had substantial difficulties with NP embedding. Two-thirds of the younger children produced no NP embedding at all. This does not follow from current assumptions about coordination structures. Recently, the goal has been to integrate coordination into X-bar theory (contra, e.g. Jackendoff 1977), whether by adjunction (Munn 1993) or complementation (Johannessen 1998). Under this approach, coordinates are structurally equivalent to either of the twice-embedded structures in (1). This precludes a purely structural explanation of the relative difficulty of the PP and genitive recursive structures.

The NP embedding/coordination contrast is thus placed squarely in the domains of processing and/or semantics, i.e. interpretive complexity at the interface. Coordinating three NPs just augments a set. Embedding, via either adjunction or complementation, reformulates the description of a set. The descriptive content of lower referents serves to restrict the domain of the higher nominal.
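The contrast just described – coordination merely augmenting a set versus embedding restricting a description – can be made concrete with a small model-theoretic sketch. This is our illustration only, not the authors' formalism; the sets, names, and relations below are invented to match the scenario described above (two girls, each with a dog, only one dog wearing a hat).

```python
# Toy model of the scenario: two girls, two dogs, one hat.
# All names here are illustrative assumptions, not data from the study.
girls = {"girl1", "girl2"}
dogs = {"dog1", "dog2"}
has_dog = {("girl1", "dog1"), ("girl2", "dog2")}  # who owns which dog
has_hat = {"dog1"}                                # only dog1 wears a hat

# Coordination ("the girl, the dog and the hat") just augments a set:
coordinated = girls | dogs | {"hat1"}

# Embedding ("the girl with a dog with a hat") restricts a description:
# the lower phrase restricts `dogs`, and that restricted set in turn
# restricts `girls`, picking out a unique referent.
hatted_dogs = {d for d in dogs if d in has_hat}
target = {g for g in girls if any((g, d) in has_dog for d in hatted_dogs)}
print(target)  # → {'girl1'}
```

The point of the sketch is that coordination grows the set monotonically, while each embedding step recomputes a smaller denotation – the interpretive work that, on the account above, is the locus of complexity.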

## **3 Sequential and recursive PP modification**

A subsequent study explored the next logical question (Pérez-Leroux, Peterson, et al. 2018). Does each step in embedding increase the complexity of the nominal structure? We set up a minimal comparison between two types of doubly modified structures involving locatives, relying on a similar referential task to the one previously employed, but contrasting two types of contexts. One condition required two PPs modifying the same head noun as in (3a), whereas in the other (3b), the head noun is modified by a PP, itself modified by a lower PP.

	- b. the bird [ on the alligator [ in the water ]]

A detailed comparison of these two constructions reveals that, syntactically and semantically, they are equally complex, at least in principle. Their generation involves not only the same core operations (e.g. Merge, predicate modification), but also the same number of core operations. Given the formal parallels of the two constructions, we would expect comparable patterns of production. However, a strong asymmetry arises. Both children and adults produced twice-embedded PP modification at half the rates of double sequential modification. Since everything else is held constant, productivity can be interpreted as a reflection of less complexity. Given the comparability between the task and the structure, this suggests that depth of embedding results in more complex configurations. What might account for this difference? Again, we must look to the interface to explain this. Under the logic of phase theory, a phasally complete functional domain like DP should cease to function as a complex object (phase impenetrability condition, PIC). While (3a) and (3b) are equivalent with respect to the number of phasal domains (assuming one views DP as a phase), in (3b), but not (3a), the referent of the head noun is restricted by an expression that is inaccessible under the PIC. In fact, the descriptive content of the lower phase *in the water* in (3b) was essential for success in the experimental task: other alligators lurked on land. We submit that this is the source of the added complexity of these structures, but note that this is not complexity in the narrow syntax – the narrow syntax freely generates such structures – the challenge rests in interpretive requirements at the interface.

## **4 PP/relative clause modification and recursive PP modification**

A third observation in support of our view of complexity also originates from Pérez-Leroux, Peterson, et al. (2018). In lieu of the target PP modifiers (4a), speakers commonly substituted relative clauses (4b) and a mix of PP and relative clause constructions (4c).

	- b. The bird that's on the crocodile that's in the water.
	- c. The one on the one on the crocodile's eyes that was in the water.

That adults were prone to use the more elaborate relative clauses (RCs) where simple PPs would do the work was a surprise. That children did so too was more so, given the extensive literature on children's difficulties with relative clauses (see references in Friedmann et al. 2009; Givón 2009). Interestingly, these expansions were particularly frequent when the target was a twice-embedded PP structure. There, the relative and mixed PP/relative strategies represented over 40% of adults' and children's target responses. This was true in English as well as in recent data from German preschoolers, obtained with the same methods (Lowles 2016). These responses are perfectly natural, and certainly successful in the context of our task. From a complexity perspective, they are perplexing – especially in the case of children – inasmuch as they constitute longer and structurally more elaborate constructions that, importantly, do not informationally add anything when compared to PP responses. The additional syntactic and semantic complexity introduced by RCs is not limited to the additional lexical material but is also due to the fact that they involve displacement and dependencies in syntax as well as additional semantic operations. Yet their use strongly suggests that the modification relation is not problematic. This leaves us with a mystery: Why should children and adults frequently use the structurally more elaborate relative clause strategy to express modification?

If complexity is not computed in narrow syntax as the result of a number of recursive applications of Merge, then this result can be interpreted from a different angle. Several possibilities arise which differ with respect to how "detached" from the computational component the complexity issue really is. For instance, as early as 1963, Chomsky & Miller argued that the complexity of recursive self-embedding results from performance processes, not formal grammar. In contrast, Arsenijević & Hinzen (2012) note that instances of X directly dominating another instance of X are rare: the common strategy is for referential expressions to dominate others of the same type indirectly, via sequences of functional categories. For them this is a direct result of the phasal architecture of the computational component. Everything seems to function as if to create a structural contour between referential expressions in a phrase.

On a final note, our conclusion that complexity of recursive embedding does not reside in narrow syntax is supported by comparable data recently collected from French and Japanese (Bamba et al. 2016; Roberge et al. 2018). In these languages, children do not readily rely on the relative clause strategy; they incorporate it gradually, as one would expect. One possible explanatory route is to link this cross-linguistic difference to uniformity in the directionality of embedding: French and Japanese are uniformly right- and left-embedding, respectively, whereas German and English mix branching directionality in their nominal syntax. If this is confirmed by further studies on additional languages, we would conclude that recursive PP embedding is not computationally more complex than any other application of Merge, and that the avoidance of twice-embedded PPs in our experiments must be accounted for by recourse to other considerations.

## **5 Genitives and PPs**

The cases discussed so far implicitly follow a quantity metric, comparing the target structures in the two types of double-modification contexts with respect to the number of noun phrases, embedding steps, layers of functional structure, and steps required for semantic derivation. Let us now turn to qualitative differences. Do different types of NP embedding yield differences in complexity for reasons unrelated to structural metrics? Here we focus on possessive embedding (1a), which differs from comitative PPs (1b) in terms of directionality and case marking. Again, on minimalist assumptions about recursion, the answer should be no. However, accounts of acquisition difficulties often rely on notions of uniformity, and the basic typology of the target language. It is conceivable that in English, a fundamentally right-branching and analytic language, the genitive *'s* construction might be constrained in acquisition. It is, after all, constrained in related languages. Roeper & Snyder's (2004) observation that the cognate possessive form in German does not iterate (i.e., German allows *NP's NP* but not *NP's NP's NP*) was the starting point in the study of the acquisition of recursive self-embedding structures. Such language differences prove that rule acquisition (i.e., possessive *-s*, in this case) is a learning step distinct from the acquisition of rule iteration (allowing multiple instances of the embedding process). The data in Pérez-Leroux et al. (2012) suggested a delay. First-level embedding appeared simultaneously for genitives and PP modifiers. Second-level embedding was a distinct stage, attained first for PP modifiers. Since few children attained the second stage in the development of complex NPs, this was clearly worth further investigation. We recently elicited data on the production of recursive possessives and PPs in a group of seventy-one English-speaking children in Toronto (Pérez-Leroux et al. in preparation). While overall rates of production success were slightly higher for recursive comitative PPs, children did not acquire them earlier than genitives. In fact, the converse was true. Individually, more children could produce recursive sequences of possessive *-s* than of comitatives (*NP with NP with NP*) at a ratio of 5 to 1 compared to the converse pattern. This is due to the PP/RC trade-off described in §4. Possessives were rarely substituted by other forms, so a child could more easily embed possessives twice. We can safely conclude that the structurally distinct properties of the possessive construction do not constrain children's ability to iterate genitive embedding.

## **6 Conclusion**

The notion of complexity – often loosely defined and used intuitively – is illuminated by the consideration of caps on recursion as observed in acquisition studies. Four cases were discussed, all pointing to the conclusion that complexity cannot be characterized exclusively in terms of the number of iterations of Merge. In closing, we return to WHRH and the view of recursion articulated therein. WHRH take complexity to correlate with iterations of the recursively defined generative structure-building procedure, with caps on recursion/complexity reducing to (arbitrary) extra-linguistic considerations. Couched in the traditional dichotomy, their focus is on competence. We argued that this view of complexity does not shed light on the nature of caps on recursion observed in the language acquisition studies reported here. However, we believe our results are consistent with the overall view of recursion articulated in WHRH. The absence of a correlation between complexity and recursive iterations of Merge is exactly what one might expect if the recursive nature of grammar is an architectural universal and hence unlearnable (as WHRH say). Likewise, WHRH's view that caps on recursion/complexity must be understood in terms of conditions external to narrow syntax resonates with our findings, though it is not at all clear to us how external (or arbitrary) these really are. Our studies point to the need for future work to determine and articulate the nature of complexity at the interface.

# **Abbreviations**


## **Acknowledgements**

Authors are listed in alphabetical order. We are grateful to Anny P. Castilla-Earls, Erin Hall, Gabrielle Klassen, Erin Pettibone, Tom Roeper, Petra Schulz, Ian Roberts and two anonymous reviewers for helpful comments and discussion. We also gratefully acknowledge funding from the Social Sciences and Humanities Research Council of Canada (IG 435-2014-2000 "Development of NP complexity in children" to Ana T. Pérez-Leroux & Yves Roberge).

# **References**






# **Chapter 3**

# **From macroparameters to microparameters: A Bantu case study**

Jenneke van der Wal

Leiden University

> Crosslinguistic variation in the Bantu languages provides evidence for a more fine-grained model of parameter setting, ranging from macro- via meso- and micro- to nano-parametric variation, as proposed by Biberauer & Roberts (2015a). The various sizes of parametric variation in Bantu are discussed for word order, verb movement, ditransitive symmetry, locatives, and φ indexing in the clause. Taking a Minimalist featural perspective, the resulting emergent parameter settings and hierarchies are motivated by third-factor principles. The paper furthermore shows how macrovariation does not equal macroparametric variation.

# **1 Parametric variation**

In a Minimalist approach to syntactic variation, the variation is often assumed to be located in the lexicon, since the items in the lexicon need to be learned anyway, be they of a lexical or functional nature. This basis of parametric variation is captured in the Borer–Chomsky conjecture (Baker 2008: 3, cf. Borer 1984; Chomsky 1995), building on the lexical parameterization hypothesis (Manzini & Wexler 1987) and the functional parameterization hypothesis (Fukui 1995):

(1) All parameters of variation are attributable to differences in the features of particular items (e.g. the functional heads) in the lexicon.

Jenneke van der Wal. 2020. From macroparameters to microparameters: A Bantu case study. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 25–60. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972832

This entails that parameter settings involve, first, the selection of which formal features are present in the grammar of a language, and second, where in the language these features manifest themselves. This creates natural dependency relations, which can be captured in parameter hierarchies – the backbone of the ReCoS project as proposed by Ian Roberts (see Roberts & Holmberg 2010; Roberts 2012 and much work in collaboration with other members of the project). An example is the hierarchy for word order (Roberts 2012), assuming that the default is for languages to be head-initial (Kayne 1994) and that head-finality is triggered by a feature moving the complement to the specifier of the head containing the feature (see further in §2.1):

(2) Word order parameter hierarchy (Roberts 2012):

There are two main conceptual motivations for exploring this hierarchical model of parameterisation. First, organising parameters in a dependency relation – rather than postulating independent parameters – drastically reduces the number of possible combinations of parameter settings, i.e. the number of possible grammars, as shown by Roberts & Holmberg (2010), Sheehan (2014), and Biberauer et al. (2014).

Second, the parameter hierarchy can serve to model a path of acquisition that is shaped by general learning biases (a component of the "third factor" in language design, Chomsky 2005). Biberauer & Roberts (2015a; 2017) suggest that two general learning biases combine to form a "minimax search algorithm":

(3) a. Feature economy (FE): Postulate as few features as possible to account for the input.
 b. Input generalisation (IG): If a functional head sets a parameter to value v<sup>i</sup>, then there is a preference for all functional heads to set this parameter to value v<sup>i</sup> (a.k.a. "maximise available features").

By FE, the first parameter is always whether a feature is present/grammaticalised in a language at all (cf. Gianollo et al. 2008). If there is no evidence for the presence of the feature, this first question will not even be asked. If there is evidence, a formal feature is posited, and by IG the feature is taken to be present on all heads. Only if there is counterevidence in the primary linguistic data (PLD) for this omnipresence will an acquirer postulate new categories and ask more specific questions about the distribution of the feature, i.e. on which subset of heads the feature is present. We thus derive a "none-all-some" order of implicational parameters and of parameter acquisition, as represented in (4). Parameters in this system are thus an emergent property of the grammar; see Biberauer & Roberts (2015a; 2016), Biberauer (2017a,b; 2018), and Roberts (2019) for a full explanation of this emergent parameter setting.

(4) F present?
    no        yes: on all heads?
                   yes       no: on which subset of heads?
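The query order in (4) can be rendered as a toy procedure. This is our illustrative sketch of the "none > all > some" logic driven by FE and IG, not a formalism from the chapter; the function name, the head labels, and the evidence format are all assumptions made for the example.

```python
# Toy sketch of the "none > all > some" order of parameter queries.
# All names and the evidence representation are illustrative assumptions.

def set_parameter(feature, heads, evidence):
    """Return a head -> feature-presence mapping for one parameter.

    `evidence` maps a head to True when the PLD shows `feature` active on it.
    """
    # Feature Economy: with no evidence at all, the feature is never posited,
    # and the more specific questions below are never asked ("none").
    if not any(evidence.get(h, False) for h in heads):
        return {h: False for h in heads}
    # Input Generalisation: the first hypothesis is that the feature is
    # present on all heads ("all", a macro setting).
    if all(evidence.get(h, False) for h in heads):
        return {h: True for h in heads}
    # Counterevidence forces the acquirer to ask on which subset of heads
    # the feature sits ("some", a meso/micro setting).
    return {h: bool(evidence.get(h, False)) for h in heads}

heads = ["C", "T", "v", "V"]
print(set_parameter("head-final", heads, {}))                          # "none"
print(set_parameter("head-final", heads, dict.fromkeys(heads, True)))  # "all"
print(set_parameter("head-final", heads, {"V": True}))                 # "some"
```

The three calls mirror the three outcomes in (4): no evidence leaves the feature unposited, uniform evidence yields a macro setting, and mixed evidence triggers the subset question.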

This none > all > some acquisition creates a hierarchy that we can think of as ever more specified (i.e. featurally rich) parameters. In "size" terms, Biberauer & Roberts (2015a; 2016) propose the following taxonomy of parameters:

### (5) Types of parameters

For a given value *v<sup>i</sup>* of a parametrically variant feature F:


These parameter settings are said to have consequences for typology, acquisition, and diachrony. True macroparameters sit at the top of the hierarchy, determined by the complete absence or omnipresence of a feature. Typologically, the subsequent parameter settings have longer and more complex featural descriptions (since the descriptions are essentially aggregates of prior parameter settings), indicative of increasingly more marked grammatical systems. In terms of acquisition, the higher parameters need to be set before lower parameters can be, which means that the further down the hierarchy a parameter is, the further it is expected to be along a learning path.

A conceptual motivation for the various sizes of parametric variation has thus been presented in the work by Biberauer and Roberts, but there remains a need for empirical evidence for these size differences. Biberauer & Roberts (2012; 2016) and Ledgeway (2013) form a good start, and the first goal of the current paper is to show that the different sizes of parametric variation are empirically verifiable in the Bantu languages, allowing a clearer insight into the nature of cross-Bantu variation, and a finer-grained discussion of how languages differ parametrically.

A second goal of the current paper is to show how parameter setting sizes need to be distinguished from geographical and genealogical "sizes" of variation. This is an important distinction that is not always made explicit: there is a difference between *sizes of variation* and *sizes of parameter settings*. The terms "macrovariation" and "microvariation" are standardly used when referring to comparative differences in a respectively larger or smaller geographical area, or at a respectively higher or lower level of genealogical relations. For example, one might talk about macrovariation between Algonquian and Sinitic languages (e.g. for polysynthetic vs. analytic morphology), or microvariation among northern Italian dialects. Given the relative robustness and stability of higher parameters with respect to lower parameters, we expect the variation in parameter size to go together with this geographical and genealogical variation. Logically speaking, however, the two are distinct. For example, if the presence of the feature uCase is one of the parameters, then it can be set as a macroparameter: either DPs need to be licensed or they do not.<sup>1</sup> Diercks (2012) shows that some Bantu languages do not show evidence for the presence of uCase, essentially setting the first parameter in this potential hierarchy to "no": uCase features are not present. In contrast, I show that at least the Bantu languages Makhuwa and Matengo do show evidence for the presence of abstract Case (van der Wal 2015), which again appears to be set as a macroparameter for the whole language. This means that we find both macroparametric settings ("no" and "all") in different Bantu languages. Although this is a variation in macroparametric settings, it would not typically be described as macrovariation, since it concerns variation within a subfamily.

<sup>1</sup>Halpert (2012; 2016) and Carstens & Mletshe (2015) suggest that even if uCase is absent on T (no evidence for nominative/subject case), there might still be a requirement for nominals in the lower domain to be licensed. Halpert claims that bare nominals can be Case licensed either inherently if they have an augment (K) or by a clause head while in the vP domain; Carstens & Mletshe propose semantic Case licensing by a low Focus head along with a value for [Focus]. This suggests a micro setting for the Case parameter in these languages.


With this background and these aims, the rest of the paper illustrates the various sizes of parametric variation across Bantu. §2 exemplifies each parameter size (from macro to nano) from different domains: word order, verb movement, symmetry in double objects, and locatives. §3 focuses on one domain, φ feature indexing, and attempts to establish a parameter hierarchy, capturing the variation as found in the Bantu languages, and exploring the nature of parameter hierarchies in the process.

# **2 One size does not fit all: Bantu illustrations of parameter sizes**

### **2.1 Macro setting: Word order parameter**

Under the assumption that head-initiality is the basic parameter setting (Kayne 1994), head-finality can be seen as the presence of a movement feature triggering "roll-up" movement. This feature can then be present on no heads, all heads, or a subset of heads, as already referred to above. The Bantu languages are almost all straightforwardly head-initial in all domains: initial complementisers, aux-V order, V-O order, prepositions, and N-possessor order, as illustrated in (6).

(6) Swahili (G42, Lydia Gilbert, p.c.)<sup>2</sup>
A-li-ni-ambia kwamba a-ta-enda ku-nunua mkate bila mfuko w-a wazazi.
1sm-pst-1sg.om-tell comp 1sm-fut-go inf-buy 3.bread without 3.bag 3-conn 2.parents
'S/he told me that s/he would go to buy bread without her parents' bag.'

In a parameter hierarchy for word order as in (2) above, the Bantu languages overall are in the initial state: no head-final features. In acquisition this means that the parameter is left as unspecified, since there is no evidence whatsoever in the PLD that would trigger an acquirer to even consider the presence and spread of the feature.

In this case, a macro-setting for the word order parameter happens to also be associated with macrovariation, in the sense that there is not much variation *within* the Bantu language family but only on a macro-level of comparing language families.

<sup>2</sup>Bantu languages are classified with a letter (region) and number (language), according to Maho's (2009) update of the original classification by Guthrie (1948).


However, there are tiny patches of head-finality to be found here and there. Two examples are O-V order in Tunen, and final question particles in languages like Rangi<sup>3</sup> and Zulu. While these languages are otherwise head-initial, they show head-finality in some restricted areas of the grammar.

Tunen is one of very few Bantu languages in which the direct object typically precedes the verb (7a). Only when the object is (contrastively) focused will it follow the verb, and in addition be marked with a contrastive particle *á* (7b).

(7) Tunen
	- a. Àná mònɛ índì.
	  3sg.pst money give
	  'S/he gave money.'
	- b. Àná índì á mònɛ.
	  3sg.pst give ptcl money
	  'S/he gave MONEY.'

Final particles form another example: while complementisers typically precede the clause they embed, question particles in Zulu and Rangi are clearly clause-final, evidencing a high interrogative-related projection (cf. Buell 2005; 2011).

(8) Zulu
	- a. U-Sipho u-ya-yi-thanda lo-mculo.
	  aug-1.Sipho 1sm-dj-9om-love 3.dem-3.song
	  'Sipho likes this song.'
	- b. U-Sipho u-ya-yi-thanda lo-mculo na?
	  aug-1.Sipho 1sm-dj-9om-love 3.dem-3.song q
	  'Does Sipho like this song?'

<sup>3</sup>Rangi (like some surrounding languages) is famous for its main clause V-aux order in two future tenses (Gibson 2016), which is an instance of head-finality too. However, the fact that the object still follows the auxiliary argues against roll-up movement, and thus against an analysis involving the same feature. Furthermore, the strict adjacency required between the infinitival verb and the auxiliary, as well as the fact that clauses with a filled C-domain (relative, cleft, wh, focus) require aux-V order, argues in favour of V-aux being derived by phrasal movement of only the infinitive to the specifier of the aux, rather than a full comp-to-spec movement.


(9) Rangi
	- a. Ma-saare y-áányu mwi-ter-iwre ʉʉ?
	  6-words 6-your 2pl.sm.pst-listen-pfv.pass q
	  'Were your words listened to?'
	- b. Nɨ w-arɨ w-óó-sáák-a úry-a wʉʉ?
	  cop 14-stiff.porridge 2sg.sm-prog-want-fv eat-fv q
	  'Is it stiff porridge that you want to eat?'

While the typical Bantu acquirer generally does not pay any attention to the word order parameter and happily leaves the "no" setting intact, the illustrated phenomena provide potential input to the Rangi or Tunen acquirer that the "no" setting is not quite right. It is also clear that not all heads are head-final (skip macro), and that the verbal domain is not head-final in its entirety either (skip meso), which means that the head-final feature is at most only present on a subclass of heads, i.e. a micro-setting. Specifically for Tunen, it seems to only be present on V,<sup>4</sup> and in Rangi and Zulu only on a head in the high discourse domain of the clause.

### **2.2 Meso setting: Clausal head movement**

Another point where Bantu languages do not seem to vary internally is the template of verbal morphology and the structural position of the verb stem. Bantu verbs consist of a root with inflectional prefixes and (mostly optional) derivational suffixes, ordered as in the simplified template in Figure 3.1.<sup>5</sup>


Figure 3.1: Slots in the Bantu verb

<sup>4</sup>This is likely a subset of V that c-selects for a DP object, as for example CP complements still follow the verb.

<sup>5</sup>Some Bantu languages also use tense, aspect, mood (TAM) inflectional suffixes. An anonymous reviewer points out that TAM marking in Chimwiini is prefixal in general, with the exception of the past tense, which is a suffix. Whether this exception is due to a syntactic nano-parameter or a different morphological specification remains a question for further research.


This verbal morphology provides clear clues as to its underlying syntax. The most attractive structural analysis of this verbal structure is, following Myers (1990), Julien (2002), Kinyalolo (2003), Carstens (2005) and Buell (2005), and drawing on the explanation in van der Wal (2009), that the verb starts out as a root and incorporates the derivational and inflectional *suf*fixes by head movement in the lower part of the clause. It then terminates in a position lower than T. The inflectional *pre*fixes on the verb represent functional heads spelled out in their base positions. The (derived) verb stem and prefixes form one word by phonological merger. See Julien (2002) for the more elaborate argumentation.

To illustrate and argue for this derivation, consider first the Makhuwa example in (10) and the proposed derivation in (11). The verb stem *-oon-* 'to see' head-moves to CausP and incorporates the causative morpheme to its left: *-oon-ih-*. This combined head moves on to ApplP, incorporating a further suffix to its left: *-oon-ih-er-*. The next step adds the passive morpheme to form *-oon-ih-er-iy-* and this complex moves once more to add the final suffix, which has been posited in an aspectual projection just above vP. Crucially, these are all suffixes, and they surface in reversed order of structural hierarchy (Baker's 1988 mirror principle).

(10) Makhuwa (P31, van der Wal 2009: 168–169)

nlópwáná o-h-oón-íh-er-íy-á epuluútsá
1.man 1sm-pfv.dj-see-caus-appl-pass-fv 9.blouse
'the man was shown the blouse'

### 3 From macroparameters to microparameters: A Bantu case study

One might expect the verb to move even higher (v-to-T-to-C), but there is no reason to assume that a moved head will first incorporate morphemes to its right (the derivational extensions and final inflectional suffix) and then to its left (the agreement and TAM markers). Therefore, the fact that inflectional morphemes surface as prefixes strongly suggests that these are not incorporated into the verb in the same way as the derivational suffixes, and thus that the verb has not head-moved further in the inflectional domain. The prefixes do form one phonological unit with the verb stem, but are posited as individual heads that attach to the rest of the verb by phonological merger only.
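The incorporation steps described above can be sketched schematically. The following is a minimal illustration (the encoding and function name are mine, not part of the original analysis) of how successive incorporation of suffixes under head movement yields the mirror order:

```python
# Illustrative sketch (hypothetical encoding): the verb root head-moves through
# the functional heads of the lower phase, incorporating each one as a suffix.
# Because the lowest head is incorporated first, the suffix string surfaces in
# the reverse order of the structural hierarchy (Baker's 1988 mirror principle).

def head_move(root, heads_bottom_up):
    """Incorporate each head, bottom-up, as a suffix on the moving complex head."""
    complex_head = root
    for suffix in heads_bottom_up:      # lowest head is incorporated first
        complex_head = complex_head + "-" + suffix
    return complex_head

# Makhuwa (10): root -oon- 'see' picks up caus, appl, pass and the final suffix.
assert head_move("oon", ["ih", "er", "iy", "á"]) == "oon-ih-er-iy-á"
```

The inflectional prefixes, by contrast, are not part of this loop: on the analysis above they are spelled out in their base positions and attach by phonological merger only.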

Another argument for this analysis is found in the order of the prefixes. If the inflectional prefixes were also the result of head movement, like the suffixes, they would be expected to surface in the opposite order. This is indeed what we find in French, where there is independent evidence that the verb moves to T: the inflectional morphemes appear in the reverse order of the Makhuwa inflectional prefixes (12), and they appear as suffixes on the verb in (13).

(12) Makhuwa (P31, van der Wal 2009: 169)

kha-mw-aa-tsúwéla
neg-2pl.sm-ipfv-know
'you didn't know'

(13) French

nous aim-er-i-ons
1pl.pron love-irr-pst-1pl
'we would love'

The verbal morphology thus provides evidence for head movement of the verb in the lower part of the clause to a position just outside of vP, with the prefixes spelled out in their individual positions in the inflectional domain above vP/AspP. Assuming with Roberts (2010) that head movement is triggered by features on heads (and a subset relation of the features of the goal with respect to its probe), Bantu verbal movement can be accounted for in featural terms by the distribution of this feature in the lower part of the clause only. More precisely, only the heads in the lower phase trigger head movement, not those in the higher phase: a mesoparametric setting (see also Ledgeway 2013 and Schifano 2015 for a parametric account of variation in the height of verb movement in Romance).

Coming back to the distinction between macrovariation and macroparametric variation, notice that the vast majority of the language family displays this

<sup>6</sup>The passive morpheme can also reside in a higher VoiceP; for the current point it does not make a difference.


"halfway" head movement. This is an invariant "macro" fact about Bantu *crosslinguistic* (non-)variation that nevertheless clearly is at a meso-level of *parametric* variation, illustrating again that these notions should be kept apart.

### **2.3 Micro setting: (A)symmetrical double objects**

Ditransitives in Bantu languages show crosslinguistic variation as well as language-internal variation in the behaviour of the two internal arguments. Bresnan & Moshi (1990) divided Bantu languages into two classes – symmetrical and asymmetrical – based on the behaviour of objects in ditransitives: languages are taken to be symmetrical if both objects of a ditransitive verb behave alike with respect to object marking and passivisation (see Ngonyani 1996; Buell 2005 for further tests). In Zulu, for example, either object can be object-marked on the verb (14), making this a "symmetrical" language.<sup>7</sup>

(14) Zulu

a. U-mama u-nik-e aba-ntwana in-cwadi.
   1a-mama 1sm-give-pfv 2-children 9-book
   'Mama gave the children a book.'

b. U-mama u-**ba**-nik-e in-cwadi (aba-ntwana).
   1a-mama 1sm-**2om**-give-pfv 9-book 2-children
   'Mama gave them a book (the children).'

c. U-mama u-**yi**-nik-e aba-ntwana (in-cwadi).
   1a-mama 1sm-**9om**-give-pfv 2-children 9-book
   'Mama gave the children it (a book).'

Conversely, in asymmetrical languages only the highest object (benefactive, recipient) can be object-marked; object-marking the lower object (theme) is ungrammatical.

(15) Swahili (G42)

a. A-li-**m**-pa kitabu.
   'She gave him a book.'

<sup>7</sup>One should, however, be careful in characterising a whole language as one type, since it has become more and more evident that languages are usually only partly symmetrical (Schadeberg 1995; Rugemalira 1991; Thwala 2006; Ngonyani 1996; Ngonyani & Githinji 2006; Riedel 2009; Baker 1988; Alsina & Mchombo 1993; Simango 1995; Zeller & Ngoboka 2006; Jerro 2015; van der Wal 2017, etc.).


Following Haddican & Holmberg (2012; 2015), I propose in van der Wal (2017) that symmetry in Bantu languages derives from the ability of lower functional heads like the Applicative to (Case) license an argument either in its complement or in its specifier, as in (16) and (17).

(16) v agrees with BEN (and can spell out as Benefactive object marker)

(17) v agrees with TH (and can spell out as Theme object marker)

In asymmetrical languages, Appl always licenses the theme and (16) is the only possible derivation, whereas in symmetrical languages Appl is flexible in licensing either argument (and either derivation in (16) and (17) is possible). The features involved in flexible licensing are discussed in van der Wal (2017), but for the current discussion it suffices to take this licensing flexibility to account for the difference between asymmetrical and symmetrical languages.


However, within these "symmetrical" languages, there is variation in which low functional heads are flexible. That is, lexical ditransitives, applicative verbs and causative verbs differ in symmetry, across and within languages. For example, in Otjiherero the lexical ditransitive (not shown) and applied verb (18) behave symmetrically for object marking, but causatives are asymmetrical, only allowing object marking of the causee and not the theme (19).

(18) Otjiherero

a. Má-vé **vè** tjáng-ér-é òm-bàpírà.
   prs-2sm 2om write-appl-fv 9-letter
   'They are writing them a letter.'

b. Má-vá **ì** tjáng-ér-é òvà-nâtjé.
   prs-2sm 9om write-appl-fv 2-children
   'They are writing the children it.'

(19) Otjiherero

a. Ma-ve **ve** tjang-is-a om-bapira.
   prs-2sm 2om write-caus-fv 9-letter
   'They make them write a letter.'

b. \* Ma-ve **i** tjang-is-a ova-natje.
   prs-2sm 9om write-caus-fv 2-children
   'They make the children write it.'

This means that flexibility is present only in a subset of functional heads in the lower phase, i.e. a microparameter, and within that subset we can distinguish even further microparameterisation, for example only applicative but not causative in Otjiherero. Moreover, there appears to be an implicational relation as to which types of ditransitives show symmetrical object behaviour (van der Wal 2017), shown in Table 3.1.

How can this relation be accounted for? Following Pylkkänen (2008), I take the lexical ditransitive to involve a low applicative head (LApplP) under V. Applicative verbs contain a high applicative head (HApplP) between V and v, and causative verbs have a causative head (CausP) above HApplP, either between V and v or above a second little v (see further Pylkkänen 2008 on different heights of causatives). The pattern in Table 3.1 can then be understood as an implicational relation between low argument-introducing heads, such that if a relatively higher head is flexible (= shows symmetrical object behaviour), lower heads do so too.


Table 3.1: Implicational relation in ditransitive symmetry


For our parameters, this means that within the microparametric subset of heads (namely, Case licensing functional heads in the lower phase of the clause), the none-all-some pattern introduced above re-appears:

(20) Parameter hierarchy for (a)symmetry in ditransitive alignment (adapted from van der Wal 2017)

Are low functional heads flexible in licensing?

This falls out naturally if we acknowledge that the creation of a subset type results from specifying an additional formal feature. Following the same logic, the basic questions inspired by FE and IG apply to these features too: the feature is only postulated if there is evidence (is it present → yes: create subset), it is then assumed to be present in the whole subset (*all* heads within the subset), and only if there is further evidence that it is not present for all heads in the subset is a further subset created (defined by another feature). The scope of a parameter setting thus grows smaller and smaller the further down the hierarchy the parameter sits, but the mechanism stays the same. The microparameter for object symmetry illustrated here applies only to the object domain and can therefore be considered relatively small; indeed, it is microparametric variation, which in this case also corresponds to a geographical and genealogical size of microvariation.
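The acquisitional logic just described can be made concrete with a small sketch. All names here are hypothetical, and the refinement step is a simplification (it simply keeps the evidence-positive heads, whereas the actual subsets are defined by formal features):

```python
# Sketch of the FE/IG-driven none-all-some logic: postulate a feature only on
# evidence (Feature Economy), generalise it to the whole current set of heads
# (Input Generalisation), and create a smaller subset only on counter-evidence.

def parameter_path(heads, evidence):
    """Return the sequence of (head set, answer) steps down the hierarchy.

    `evidence[h]` is True if the input shows the feature on head h.
    """
    current = frozenset(heads)
    if not any(evidence[h] for h in current):
        return [(current, "none")]          # FE: no feature postulated at all
    path = []
    while True:
        if all(evidence[h] for h in current):
            path.append((current, "all"))   # IG: whole subset bears the feature
            return path
        path.append((current, "some"))      # counter-evidence: refine further
        current = frozenset(h for h in current if evidence[h])

# E.g. flexible licensing on Appl but not Caus (as in Otjiherero) narrows
# the subset of flexible low heads:
steps = parameter_path(["v", "Appl", "Caus"],
                       {"v": True, "Appl": True, "Caus": False})
assert steps == [(frozenset({"v", "Appl", "Caus"}), "some"),
                 (frozenset({"v", "Appl"}), "all")]
```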

<sup>8</sup>This is a theoretical possibility, representing flexible licensing that is sensitive to other factors.


### **2.4 Nano setting: Locatives**

Coming to the smallest size of parametric variation, it is important to note that nanoparameters should be distinguished from what Biberauer & Roberts (2015b: 9) call "parametric fossils". Syntactic parameters, however limited their scope might be, still have effects in the syntax rather than just affecting the morphology (as with, for example, irregular past tenses, which have no syntactic effect). Such a nanoparametric syntactic setting can be found in Tswana locatives.

The Bantu noun classes include a number of locative classes, most commonly the classes traditionally numbered 16–17–18 and sometimes 23 (Meeussen 1967). In Chichewa, for example, the class 18 prefix *mu*- derives a locative DP (with meaning "inside") from a noun in a non-locative class, as shown in (21).<sup>9</sup> The DP status can be seen in the locative's ability to control subject agreement and object agreement on the verb. However, not all languages retain locatives as a part of the noun class system, as there is variation in the categorial status of locatives. In some southern Bantu languages locative DPs have undergone the "great locative shift" (Marten 2010), reanalysing the locative prefix as a preposition. Locatives are thus PPs in these languages, as illustrated for Zulu in (22).

(21) Chichewa

a. Ndí-ma-**ku**-kóndá ku San José.
   1sg.sm-prs.hab-17om-love 17 San Jose
   'I like (it) (in) San José.'

b. Mu-nyumba **mu**-na-yera.
   18-9.house 18sm-pst-white
   'Inside the house is clean.'
   [DP [nP mu [NP nyumba]]]

(22) Zulu (S42, Buell 2007)

Ku-lezi zindlu ku-hlala abantu abakhubazekile.
17-10.these 10.houses expl-stay 2.people 2.handicapped
'In these houses live handicapped people.'
[PP ku [DP lezi [NP zindlu]]]

<sup>9</sup>Carstens (1997) analyses locative DPs as null locative nouns taking a KP complement, of which K agrees with the locative and spells out as the locative prefix. The reanalysis to a PP then concerns the loss of null locative nouns, leaving the KP/PP. Regardless of the precise analysis of locatives (locatives as a nominal derivation by means of nP being another possibility – see Fuchs & van der Wal 2019), the process of change from DP to PP and the relics in this area illustrate a nanoparametric setting.


Riedel & Marten (2012) show that there is a continuum for Bantu locatives, ranging from a fully operative three-way (or more) distinction between the different locative noun classes, on nouns as well as agreement markers, to a completely reanalysed PP-based locative system and a reduced verbal agreement paradigm (Demuth & Mmusi 1997; Creissels 2011). Towards the latter end of this spectrum is Setswana, where locative noun classes have been lost, leaving behind some "relics". Only some prepositions show class 16 or 18 morphology (23a), and only two nouns are inherently in locative classes (23b,c).

(23) Tswana

a. class 18 *mo-rago ga* 'behind'
b. class 17 *go-lo* 'place'
c. class 16 *fe-lo* 'place'

Crucially, *golo* and *felo* are not just lexicalised locatives that are otherwise adjusted to fit a system without any formally locative arguments, but they still trigger true class 17 locative agreement on the verb, according to Creissels (2011). This is important because it means that there still is a syntactic parameter to be set, rather than the variation being "fossilised" and purely lexical.

The fact that this syntactic property is restricted to only two lexical items makes it of a nano-parametric size. This is a fragile but interesting stage of a parameter – unless the lexical items are highly frequent, there is little chance that acquirers of the language will have enough input to be able to pick it up. This means that either the property will spread through the language and remain part of the system, or that it disappears, essentially catapulting the language right to the top of the relevant hierarchy, back to the "none" setting (cf. Biberauer & Roberts 2016). The Tswana locatives seem to be on their way out, as Creissels (2011: 36) notes that "Tswana speakers tend to regularize the situation by using *lefelo* (class 5, plural *mafelo*) instead of *felo* [class 16, JW]". This effectively reanalyses the noun class of the last remaining inherently locative nouns, leading to the loss of the productive noun class.

In summary, I have presented evidence for more fine-grained parametric distinctions ranging from macro- to nano-parameters from various domains of Bantu syntax. One of the challenges in the ReCoS research programme is to see how the different sizes of variation all link up in one hierarchy, which is the topic of the next section on φ feature indexation.


## **3 Variation in the distribution of φ features**

Bantu languages tend to be head-marking in the clause, often displaying subject and object marking, as well as complementiser agreement, but again there is cross-Bantu variation. A closer look at this parametric variation in φ feature indexation turns out to be interesting, both from an empirical and a conceptual point of view. An attempt at establishing one parameter hierarchy for uφ features is shown to be problematic, but problematic in an insightful way.

### **3.1 Where we see uφ features**

In the current framework, φ feature indexation is taken to be a reflection of an Agree relation between a Probe and a Goal (Chomsky 2000; 2001). In Probe– Goal agreement, a head with an uninterpretable feature (uF), called the Probe, searches its c-command domain for valuation by the closest constituent with a matching interpretable feature (iF), the Goal. I assume that subject marking on the verb indicates the presence of a full uφ feature specification on T, and that object marking is due to uφ on little v. I take a hybrid approach to object marking as Agree with a defective goal (Roberts 2010; Iorio 2014; van der Wal 2015), which entails that all object marking, be it pronominal (non-doubling) or grammatical (doubling), involves a φ probe. The presence of uφ features on a higher head like C results in agreeing complementisers or separate relative markers on the verb (Carstens 2003; Henderson 2011, among others). Finally, I propose that the presence of uφ features on lower functional heads such as Appl and Caus results in multiple object markers, illustrated in the example in (24) and structure in (25).<sup>10</sup> The sets of φ features on the heads in the lower part of the clause are "gathered" by head movement of the verb through the lower part of the derivation (see §2.2 above). As φ features differ from the derivational heads themselves, they are spelled out as prefixes on the verb (unlike the applicative, causative, passive, etc., which appear as suffixes).

(24)

a. Maama a-wa-dde **taata** ssente.
   1.mother 1sm-give-pfv 1.father 10.money
   'Mother has given father money.'

<sup>10</sup>I leave to one side how the Kinande "linkers" (Baker & Collins 2006; Schneider-Zioga 2015) fit into this model – it might be that there is a separate LinkerP head that has uφ features (as Baker & Collins 2006 propose).


b. Maama a-zi-**mu**-wa-dde.
   1.mother 1sm-10om-1om-give-pfv
   'Mother has given him it.'

### **3.2 First attempt at a hierarchy**

In setting the uφ parameters of a language, what needs to be established is whether the language makes use of uφ probes at all, and if so, on which heads these features are present. One can thus imagine a parameter hierarchy as in (26), following the now-familiar none–all–some sequence.

(26) Possible uφ feature hierarchy 1 (cf. Roberts & Holmberg 2010; Roberts 2012; 2014)

The first parameter asks whether uninterpretable φ features are present at all in the language. If the answer is "no", this could describe radical pro-drop languages (Saito 2007; Roberts 2010; 2012; 2014; 2019), which do not show any cross-indexing and where this question will thus not even come up for the language


acquirer (sticking to FE). In contrast, verbal inflection in all Bantu languages shows at least some indexing, which means that it needs to be established how pervasive this feature is in each language.

By IG, the next parameter sets whether *all* probes have uφ. There is a question as to which heads are included in "all probes"; concretely, should both the nominal and verbal domain be considered? This is not the case for the null subject hierarchy for φ features as proposed by Roberts & Holmberg (2010), where only the clausal domain is considered. The acquisition logic of none-all-some, however, requires that the first "all" setting concerns undifferentiated categories (see Biberauer 2011; 2018, Bazalgette 2015, and Biberauer & Roberts 2017 on emergent parameters), which means that the whole domain – which is eventually split into nominal and verbal – should be considered at this macro stage. Setting this parameter to "yes" should result in agreement not just on C, T, v, and Appl but also P, D, Num, and Poss. While some Bantu languages may come close to the presence of uφ features throughout the language,<sup>11</sup> I do not know of any Bantu language showing φ agreement on prepositions,<sup>12</sup> so we need to inspect subtypes.

One step further down the hierarchy we ask whether uφ is present on a subset of heads, specifically all heads in the nominal or verbal domain. Since it may be the case that there is a relevant subset in *both* domains, we can see this as a split in a third dimension where parameters are set for the nominal domain [+N] separately from the verbal domain [+V], depending on the input. Focussing on the clausal domain for the current discussion, once the [+V] subset is identified, by IG it is assumed that all heads in the subset, i.e. all functional heads in the extended verbal projection, have uφ.

An example of a language where uφ features are generalised to occur on all (clausal) heads is Ciluba. Ciluba displays multiple object marking (i.e. uφ on v and Appl, in the system as introduced above), as well as subject marking (φ on T) and agreeing relative complementisers (φ on C). Object and subject marking are illustrated in (27); and (28) shows separate subject marking on the verb and relative agreement on the auxiliary.

(27) Ciluba
'The woman buys it (fruit) for him (the boy).'

<sup>11</sup>There is a question as to whether agreement and concord involve the same operation – see for example Giusti (2008) for discussion claiming that they are not.

<sup>12</sup>I take the Bantu connective *-a* 'of' to not be a true preposition (van de Velde 2013).


(28) Ciluba Kasai (L31, de Kind & Bostoen 2012: 104)


If not all heads in the clause have uφ, further parameterisation consists of establishing the next relevant subset where uφ is present. For the Bantu clausal domain, the next largest subset appears to be the argument-licensing heads: T, v, and Appl/Caus. Sticking with a standard view of Case licensing, this would come down to heads that have Case in their featural specification.<sup>13</sup> In Kinyarwanda, the verb famously displays multiple object marking (29) as well as subject marking, but not complementiser or relative agreement for φ features: the relative clause in (30) is formed by a high tone. This means that uφ is present on v and Appl, as well as T, but not on C. Kinyarwanda thus sets the parameter "Is uφ present on all argument-licensing heads?" to "yes", entailing that there is no uφ on C, since otherwise the language would already have finished setting its parameters at the previous question, i.e. all clausal heads have uφ.

(29) Kinyarwanda (JD61, Beaudoin-Lietz et al. 2004: 183)

Umugoré **a**-ra-na-ha-**ki**-**zi**-**ba**-**ku**-**n**-someesheesherereza.
1.woman 1sm-dj-also-16om-7om-10om-2om-2sg.om-1sg.om-read.caus.caus.appl.appl
'The woman is also making us read it (book, cl. 7) with them (glasses, cl. 10) to you for me there (at the house, cl. 16).'

(30) Kinyarwanda

a. U-mu-kózi a-bar-a i-bi-tabo.
   aug-1-worker 1sm-count-fv aug-8-book
   'The worker counts books.'

b. i-bi-tabo u-mu-kózi a-bar-á
   aug-8-books aug-1-worker 1sm-count-fv
   'the books that the worker counts'

<sup>13</sup>However, see the discussion on Case in §1 as well as Diercks (2012) and van der Wal (2015). Even if abstract Case as we know it does not play a role as influential as it does in many European languages, there is still reason to believe that a nominal licensing constraint is at play universally, as we show in Sheehan & van der Wal (2016; 2018).


For all languages setting this parameter to "no", a further subset will be found, forming the next parameter. Within the argument-licensing heads, the next question is whether uφ is present on the heads in the higher phase (i.e. v and T but not Appl). If the setting is "yes", the language has subject marking and only a single object marker, as illustrated for Makhuwa. Makhuwa shows extremely regular subject marking as well as object marking (all and only objects in classes 1 and 2 are marked; van der Wal 2009), but is restricted to one object marker (31), which means uφ on T and v, but not on Appl.

(31) Makhuwa (P31)

Xaviéré **o**-nú-**ḿ**-váhá anelá Lusiána.
1.Xavier 1sm-pfv.prs-1om-give 1.ring 1.Lusiana
'Xavier gave Lusiana a ring.'

Makhuwa equally does not show agreement on C: complementisers never agree, and the relative construction in Makhuwa does not have a relative complementiser or relative agreement. Instead, it is best analysed as a nomino-verbal participial construction which does not have an agreeing C head (van der Wal 2010).<sup>14</sup>

(32) Makhuwa (P31, van der Wal 2010: 210)

Ki-m-phéélá ekanetá tsi-ki-vah-aly-ááwé (Alí).
1sg.sm-prs.cj-want 10.pens 10-1sg.om-give-pfv.rel-poss.1 1.Ali
'I want the pens that he (Ali) gave me.'

If the parameter setting is "no" for the presence of uφ features in the higher phase, then the language only has uφ on one head. This turns out always to be the highest head left in the subset: uφ on T, i.e. only subject marking (see §3.3 below on the implicational relation for uφ on clausal heads). Basaa illustrates this parameter setting: it has a subject marker, which is written separately but is obligatory even in the presence of a full DP subject (33).

(33) Basaa (A43, Hyman 2003: 277)

Liwándá jêm **lí** m !ɓéná jɛ bíjɛk í !ndáp.
friend my sm prs do-often eat food in house
'My friend often eats food in the house.'

<sup>14</sup>What seems to be a subject marker or relative marker on the relative verb in Makhuwa (*e-* and *tsi-* in the examples) is a pronominal head (PtcpP) coreferring to the referent indicated by the head noun, e.g. both refer to a class 9 shirt and therefore are both in class 9. There is no regular subject marking, but the subject can be pronominalised on the verb as a possessive (-*aawe*), showing that the relative clause is not a full clause but lacks higher heads in the extended verbal projection. See van der Wal (2010) for details.


Objects, however, are not marked on the verb, and when the object is pronominalised it simply appears as an independent pronoun following the verb (34b).

(34) Basaa (A43, Hyman 2003: 278)


Finally, relative clauses in Basaa can be marked with a demonstrative (*nu* in (35a) and *hi* in (35b)), but Jenks et al. (2017) argue that this is not a C head.

(35) Basaa (A43, Jenks et al. 2017: 19, 20)


If Jenks et al. (2017) are correct in their analysis of the relative construction, then Basaa can be taken to illustrate a language in which only T has uφ features, whereas C, v and Appl do not.

The parameter hierarchy for Bantu languages discussed so far would thus come out as follows:

(36) Possible uφ feature hierarchy 2 (to be adjusted)
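Read as a decision procedure, this hierarchy successively checks smaller subsets of clausal heads. A minimal sketch (the subset labels and encoding are mine, not part of the original proposal):

```python
# The uφ hierarchy as a decision procedure over the uφ-bearing heads of a
# language; the subset labels follow the discussion in §3.2.

HIERARCHY = [
    ("all clausal heads", {"C", "T", "v", "Appl"}),
    ("argument-licensing heads", {"T", "v", "Appl"}),
    ("higher-phase heads", {"T", "v"}),
    ("T only", {"T"}),
]

def classify(uphi_heads):
    """Return the hierarchy setting matching a language's uφ-bearing heads."""
    if not uphi_heads:
        return "none"                       # e.g. radical pro-drop languages
    for label, subset in HIERARCHY:
        if set(uphi_heads) == subset:
            return label
    return "unclassified"                   # motivates the adjustment in §3.3

# The languages discussed in §3.2:
assert classify({"C", "T", "v", "Appl"}) == "all clausal heads"    # Ciluba
assert classify({"T", "v", "Appl"}) == "argument-licensing heads"  # Kinyarwanda
assert classify({"T", "v"}) == "higher-phase heads"                # Makhuwa
assert classify({"T"}) == "T only"                                 # Basaa
```

A language like Bembe, with uφ on C, T and v but not Appl, comes out "unclassified" on this hierarchy, which is exactly the problem taken up in §3.3.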


### **3.3 (In)dependent parameters**

If this parameter hierarchy represents the typological picture, then it holds an implicational prediction such that if a language has uφ on one head in the following scale, it will have uφ on all the heads to its right (as noted for subject and object agreement by Moravcsik 1974, cf. Givón 1976; Bobaljik 2008):

(37) C (comp/rel agr) > Appl (multiple OM) > v (OM) > T (SM)

Considering the sequence of heads in the verbal extended projection, it is clear that C is not in the expected position on this implicational hierarchy, and there are further indications that C is out of place here. For one thing, the evidence for the absence of uφ features on C in Makhuwa and Basaa depends heavily on the theoretical analysis of relative clauses, which weakens the argument for the absence of uφ on C in these languages. Moreover, there is clear evidence from other Bantu languages that φ agreement on C must be independent of uφ on the argument-licensing heads. This is illustrated by Bembe, which shows the typical Bantu subject and object marking (uφ on T and v), but does not allow more than one object marker (no uφ on Appl). Non-subject relative clauses in Bembe can display a relative marker in addition to a pronominal subject marker (38), indicating that T and C both have their own set of uφ features.

(38) Bembe (D54, Iorio 2014: 103)

a. Baana ba-twa-mon-ilé ba-b-ile babembe.
   2.children 2rm-1pl.sm-see-pst 2sm-cop-pst 2.Bembe
   'The children whom we saw were Bembe.'


b. bilewa bi-ba-koch-ilé
   8.food 8rm-2sm-buy-pst
   'the food that they bought'

This cross-linguistic situation, as illustrated in Table 3.2, suggests that the presence of uφ features on C does not form part of the implicational hierarchy that holds between the argument-licensing heads T, v and Appl, which in turn suggests that uφ on C is a parameter that is independent of the parameter hierarchy for uφ features. See also Biberauer (2017a) and references cited therein on how C behaves differently from lower heads in the domain of word order as well.

Table 3.2: Implicational relation in uφ features


However, the implicational hierarchy *does* appear to hold for the argument-licensing heads: if a language has uφ on Appl (multiple object marking) then it has uφ on v (single object marking), and if a language has uφ on v (object marking) then it has uφ on T (subject marking):

(40) Appl (multiple OM) > v (OM) > T (SM)

It is known that v's Case-assigning capacity can be dependent on T's (Marantz 1991; Baker 2015), and it is clear from the data surveyed here that the same holds for head-marking agreement (see Roberts 2014 on the same conclusion for Romance; and see, among others, Bobaljik 2008; Bárány 2015 for discussion on implicational relations between heads in the domains of Case and agreement). Additionally, based on the data surveyed for Bantu languages, this implicational relation can be extended to the lower functional heads such as Appl. The fact that these implications hold indicates that argument-licensing heads are a natural class, with a strong relation to φ feature agreement.
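This implicational relation can be checked mechanically against the languages surveyed. A small sketch (the encoding is mine), which also brings out that C falls outside the scale:

```python
# The scale in (40): uφ on Appl implies uφ on v, which implies uφ on T.
SCALE = ["Appl", "v", "T"]  # each head implies all heads to its right

def respects_scale(uphi_heads):
    """True if every scale head with uφ is matched by all lower scale heads."""
    for i, head in enumerate(SCALE):
        if head in uphi_heads:
            if not all(lower in uphi_heads for lower in SCALE[i + 1:]):
                return False
    return True

# uφ specifications as surveyed above (cf. Table 3.2):
LANGS = {
    "Ciluba":      {"C", "T", "v", "Appl"},
    "Kinyarwanda": {"T", "v", "Appl"},
    "Makhuwa":     {"T", "v"},
    "Basaa":       {"T"},
    "Bembe":       {"C", "T", "v"},
}

# The implication holds over the argument-licensing heads in every language...
assert all(respects_scale(heads) for heads in LANGS.values())
# ...while C is unordered with respect to the scale: Bembe has uφ on C
# without uφ on Appl.
assert "C" in LANGS["Bembe"] and "Appl" not in LANGS["Bembe"]
```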

This suggests a revision of the parameter hierarchy that brings out the interdependence of argument-licensing heads, keeping C apart. In fact, it suggests


that variation in the presence of uφ features on C is a parameter that is not actually part of this hierarchy, since hierarchies are only attractive for modelling *dependent* parameters (as argued in the original ReCoS research proposal; see also Roberts & Holmberg 2010 and Sheehan 2014). The separate parameters can then be modelled as in (41), representing only the dependent parameters in a macro-to-micro hierarchy:

(41) Dependent and independent uφ feature parameters

### **3.4 Further subsets**

Potential nanoparametric variation can also be found in this domain, as exemplified by Luguru and Nyakyusa. These languages do display object marking, but only for some predicates. To illustrate with one example: in Luguru the verb *-bona* 'to see' requires an object marker and cannot be used grammatically without it, as shown for animate and inanimate objects in (42) and (43). The semantically similar verb *-lola* 'to see/look at', on the other hand, does not have this requirement and occurs without an object marker (44).

(42) Luguru

a. Ni-w-on-a iwana.
   1sg.sm.tns-2om-see-fv 2.children
   'I saw the children.'


b. \* Ni-on-a iwana.
   1sg.sm.tns-see-fv 2.children
   intended: 'I saw the children.'

(43) Luguru (G35, Marten & Ramadhani 2001: 264–265)


Marten & Ramadhani (2001) claim that this variation in predicates that do or do not require/allow object marking is not due to transitivity or the choice of object, but to individual predicates. Nevertheless, it seems that it can be modelled as variation in v's selection of a predicate taking an argument rather than an adjunct, i.e. microvariation. This would fit the difference between 'see X' (argument) and 'look at X' (non-argument). What is particularly suggestive in this case is the fact that the presence of an object marker can influence the interpretation of a predicate in Luguru. Marten & Ramadhani illustrate this with the predicate *-pfika*, which is usually interpreted as 'find, meet' when used with an object marker (45a), but as 'arrive' when there is no object marker (45b).

(45) Luguru

a. Wanzehe wa-pfi-pfika ipfitabu.
   2.elders 2sm.tns-8om-find 8.books
   'The elders found books.'

b. Wa-pfika ukaye kwake.
   2sm-find house poss
   'They have arrived at / been to his home.'

c. ? Wanzehe wa-pfika ipfitabu.
   2.elders 2sm-find 8.books
   'The elders arrived at the books'
   intended: 'The elders found books.'


Such a microparametric account seems less likely for Nyakyusa (M31), which has similar restrictions on object marking (Lusekelo 2012). Here too, the presence of uφ on v is not set for *all* v heads, and transitive predicates fall into one of three groups according to their object marking possibilities (Lusekelo 2012 and p.c.):


The first type of predicate never shows object marking and thus never projects a v with uφ features. In the second and third types of predicate, uφ features must/can be present on v. It is unclear, however, how type 1 can be distinguished (featurally) from the other two types, or, in other words, how type 1 forms a natural subset. Object marking in Nyakyusa therefore appears to be an instance of nanoparametric variation: individual predicates have/do not have uφ features on v.

What underlies the distinction between the second and third type is equally unclear; alternatives suggested by anonymous reviewers include a potential semantic difference for psych vs. touch/motion verbs, and a phonological factor where the initial consonant of the verb stem or syllable structure might play a role in requiring object marking. However, at the moment this is only speculative and has to await further research on Nyakyusa object marking.

Even if the exact size of the parameter setting or the precise features involved are as yet unknown, it is clear that these languages distinguish different predicates, that is, different subtypes of little v, when it comes to the distribution of uφ features.<sup>15</sup> We thus need a further specification of subsets, arriving at the nano-level where certain predicates have a positive setting for the presence of φ features on v, indicated as v<sup>α</sup> in the adjusted hierarchy in (46).<sup>16</sup>

<sup>15</sup>Note that Sheehan (2014; 2017) proposes quite extensive subhierarchies for little v with respect to ergative alignment, but starting from a different logic underlying the shape of the parameter hierarchy.

<sup>16</sup>An alternative way of organising the hierarchy to make the typological implication fall out would be the following sequence of parameters (see also Bárány 2015): Is uφ present? > Is uφ present on T? > Is uφ present on v? > Is uφ present on Appl? Note, though, that this cannot capture the acquisitional path, and hence loses the motivation in the general cognitive principles of FE and IG.

### 3 From macroparameters to microparameters: A Bantu case study

(46) [Adjusted parameter hierarchy for uφ features, with v<sup>α</sup> at the nano level; language labels in the original figure: Ciluba, Makhuwa, Kinyarwanda, Bembe]

This exploration of the hierarchy for uφ parameters has thus brought to light that what is thought to be the same phenomenon in the first instance might actually not be part of the same parameter hierarchy – concretely, the parameter for φ features on C was shown to be set independently of the other heads in the clause. The data also revealed an interesting implicational relation for φ features on argument-licensing heads, which can be captured in a parameter hierarchy that considers smaller and smaller subsets representing Bantu-internal parametric variation from the meso to the nano level.

## **4 Conclusions**

The different sizes of variation as proposed by Biberauer & Roberts (2015a), ranging from macro to nano, fit the morphosyntactic variation in the Bantu languages better than a simple "macro" or "micro". Importantly, this perspective encourages us to look seriously at syntactic variation from a featural perspective. The featural perspective is attractive with the Minimalist programme in mind, locating parametric variation in the features (on functional heads) in the lexicon. With the outlined parameter-setting algorithm motivated by third-factor principles (Biberauer 2017a,b; Biberauer & Roberts 2017) we have a promising model accounting for crosslinguistic variation.

Bantu-internal variation shows that all parameter types are actually attested, and individual languages vary as to the "grain" of their settings: what is micro


in one system could be nano in another, etc. This is predicted in the current approach: the "same" phenomenon surfaces in different sizes in different systems (cf. Biberauer & Roberts 2016; Ledgeway 2013).

The Bantu variation also illustrates that the setting of a parameter to a certain size does not necessarily correspond to geographical or genealogical macrovariation or microvariation. Whether languages or language families differ markedly from each other (macrovariation) or are more similar but show variable properties (microvariation) tends to go hand in hand with the size of the parameter setting (because of diachronic stability), but there is no one-to-one relation: a language can have a macro setting on a certain parameter hierarchy where the rest of the family has smaller settings, and the variation between otherwise similar languages can still be characterised as microvariation – as is the case for Bantu.

The crosslinguistic variation seen in this paper thus stems from (1) whether a feature is present in a language at all; (2) on which (subset of) heads the feature is present; and (3) the combination/interaction of different parameter settings. The current state of research focuses primarily on the first and second determinants, which necessarily precede the third. Future research will hopefully shed light on the interaction of the various parameters and parameter hierarchies, especially since the "some" options in any hierarchy are formed by interaction with features that are potentially part of their own separate hierarchy. This, however, requires further conceptual and empirical investigation.

## **Abbreviations**




Numbers refer to noun classes, or to persons when followed by sg or pl. High tones are marked by an acute accent, low tones are unmarked or marked by a grave accent.

## **Acknowledgements**

This research was funded by the European Research Council advanced grant no. 269752 "Rethinking comparative syntax". I am grateful to the members of the ReCoS team, and especially to Theresa Biberauer, with whom this little project was started, and furthermore for the helpful comments of three anonymous reviewers. Any errors and misrepresentations are mine only.

## **References**




# **Chapter 4**

# **Comparative syntax: An HPSG perspective**

## Robert D. Borsley

University of Essex and Bangor University

There has been little explicit discussion of comparative matters in the HPSG literature, but HPSG has a number of properties which make it relevant to comparative syntax. Firstly, it emphasizes detailed formal analyses, often incorporated into a computer implementation. This means that the framework provides firmer foundations than some other approaches for claims about individual languages and about language in general. Secondly, it stresses how little is really known about what is and is not possible in natural language syntax. Thirdly, it seeks to develop concrete analyses closely linked to the observable data, which keep the acquisition task as simple as possible and create as little need as possible for innate apparatus. These properties suggest that HPSG can make an important contribution to comparative syntax.

## **1 Introduction**

In what ways are languages alike in their syntax? In what ways can they differ? Comparative syntax seeks to answer these questions and perhaps to explain the answers that it arrives at. It has been a major focus of mainstream generative grammar (MGG)<sup>1</sup> since the emergence of the principles and parameters framework in the early 80s, and it has been a central concern of Ian Roberts (see e.g. Roberts 1997; 2007). However, the questions that define the field of comparative

<sup>1</sup> I take this term from Culicover & Jackendoff (2005), who define it as "the line of research most closely associated with Noam Chomsky" (fn. 1, p. 3). It refers to a variety of different but related approaches. Like Culicover & Jackendoff, I do not regard "mainstream" as a synonym for "correct".

Robert D. Borsley. 2020. Comparative syntax: An HPSG perspective. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 61–90. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972834

### Robert D. Borsley

syntax are of interest not just to MGG but to any serious approach to syntax. In this paper, I will consider what the Head-Driven Phrase Structure Grammar (HPSG) framework can say about them. Although there has been work in HPSG on a variety of languages, there has not been much explicit discussion of comparative matters in the main HPSG literature. Typical papers say "here is a good way to deal with phenomenon P in language L" and not "here's an interesting way in which languages may differ". However, it is not too hard to spell out a view of comparative matters that is implicit in much HPSG work. Moreover, HPSG-based computational work has often been concerned with comparative issues, in particular with developing minimally different grammars for a variety of languages (see e.g. Müller 2015; Bender et al. 2010; Bender 2016), and this work is also of some relevance here. HPSG brings a number of ideas to the discussion of comparative syntax. One is a stress on the importance of firm empirical foundations in the form of detailed formal analyses. Another is an emphasis on how little we really know about what is and is not possible in natural language syntax. A third is an emphasis on the importance of developing concrete analyses which keep the acquisition task as simple as possible. I will discuss all of these in the following pages.

The paper is organized as follows. In §2, I look at the principles and parameters approach to comparative syntax and explain why proponents of HPSG are sceptical about it. Then in §3, I explain the main components of HPSG grammars: types, features, and constraints. In §4, I discuss the ways in which HPSG grammars may differ, and in §5, I pull together the main ideas about comparative syntax that I have introduced in the preceding sections. In §6 I conclude the paper.

## **2 Principles and parameters**

For MGG, the ways in which languages are alike and the ways in which they may differ are a reflection of an innate language faculty. The properties they share are the result of innate principles, while the ways in which they may differ are defined by innate parameters. This position has been hugely influential over the last 25 years. However, it seems fair to say that these ideas, especially the idea of innate parameters, have not been as successful as was hoped when they were first introduced in the early 1980s.<sup>2</sup>

Outsiders have always been sceptical about these ideas. Thus, Pollard & Sag (1994: 31), after considering the possibility of incorporating parameters into HPSG, comment as follows:

<sup>2</sup> See Newmeyer (2005) and Haspelmath (2008) for relevant discussion.

### 4 Comparative syntax: An HPSG perspective

In the absence of a list, however tentative, of posited parameters and their range of settings, together with a substantial, worked-out fragment for at least one language, a specification of the settings for that language, and a reasonably detailed account of how those settings account for the array of facts covered in the fragment, we are inclined to view parameter-based accounts of cross-linguistic variation as highly speculative.

More recently, linguists who are less obviously outsiders have come to similar conclusions. Thus, Newmeyer (2005: 75) writes as follows:

[…] empirical reality, as I see it, dictates that the hopeful vision of UG as providing a small number of principles each admitting of a small number of parameter settings is simply not workable. The variation that one finds among grammars is far too complex for such a vision to be realized.

At least one Minimalist has come to much the same conclusion. Boeckx (2011) suggests that:

some of the most deeply-embedded tenets of the Principles-and-Parameters approach, and in particular the idea of Parameter, have outlived their usefulness.

A major reason for scepticism about parameters is that estimates of how many there are seem to have steadily increased. Fodor (2001) considers that there might be just twenty parameters, so that acquiring a grammatical system is a matter of answering twenty questions. Newmeyer (2005: 44) remarks that "I have never seen any estimate of the number of binary-valued parameters needed to capture all of the possibilities of core grammar that exceeded a few dozen". However, Roberts & Holmberg (2005) comment that "[n]early all estimates of the number of parameters in the literature judge the correct figure to be in the region of 50–100". Clearly, a hundred is a lot more than twenty. This is worrying. As Newmeyer (2006: 6) observes,

it is an ABC of scientific investigation that if a theory is on the right track, then its overall complexity decreases with time as more and more problematic data fall within its scope. Just the opposite has happened with parametric theory. Year after year more new parameters are proposed, with no compensatory decrease in the number of previously proposed ones.


The increasing numbers might not be a cause for concern if parameters were just seen as observations about how languages may vary, but if they are seen as part of an innate language faculty, it is worrying. It is just not clear how there could be so much that is innate. Moreover, a large number of innate parameters seems incompatible with the minimal conception of the language faculty that Chomsky has championed over the last decade or so.<sup>3</sup>

Scepticism about parameters is not a matter of saying that anything goes. It is also not a matter of rejecting any notion of an innate language faculty. After all, Chomsky argued for a language faculty for two decades before he formulated the idea of parameters, and there are more recent advocates of a language faculty who do not assume parameters, for example Culicover & Jackendoff (2005). Thus, one might reject the idea of parameters but still subscribe to the idea of an innate language faculty. However, neither evidence that there are universal properties of language nor evidence that variation is limited is necessarily evidence for an innate language faculty since there may be other explanations. Thus, Sag (1997: 478), echoing much earlier work, suggests that "… perhaps much of the nature of grammars can be explained in terms of general cognitive principles, rather than idiosyncratic assumptions about the nature of the human language faculty". In rather similar vein, Chomsky (2005: 9) advocates "…shifting the burden of explanation from the first factor, the genetic endowment, to the third factor, language-independent principles of data processing, structural architecture, and computational efficiency".

Probably most proponents of HPSG would remain agnostic about these matters. No doubt there are language universals, and languages do not vary without limit, contrary to Joos's famous suggestion. But most HPSG linguists would think that we do not have enough detailed formal analyses of enough phenomena in enough languages to have any firm conclusions about these matters. In the absence of such conclusions, it is not possible to say much about the contributions of general cognitive principles and purely linguistic principles to grammatical phenomena.

## **3 The HPSG framework**

HPSG emerged in the mid 1980s, building in various ways on earlier work, and it has since been employed in theoretical and computational work on a variety of languages.<sup>4</sup> It is a monostratal, constraint-based approach to syntax. As a monostratal approach, it assumes that linguistic expressions have a single constituent

<sup>3</sup> For further discussion of parameters and the problems they face, see Newmeyer (2017).

<sup>4</sup>As a referee has pointed out to me, many of the properties of HPSG that I highlight here are also features of Lexical Functional Grammar.


structure. This means that no constituent ever appears anywhere other than its superficial position and hence that it has nothing like the movement processes that are a feature of all versions of transformational grammar. The relations that are attributed to movement in transformational work are captured by constraints that require certain features to have the same value. For example, a raising sentence is one with a verb which has the same value for the feature subj(ect) as its complement. As a constraint-based approach, it assumes that grammars involve sets of constraints, and a linguistic expression is well-formed if and only if it conforms to all relevant constraints. There are no procedures modifying representations such as the Merge and Agree operations of Minimalism. For arguments in favour of such a declarative view of grammar, see e.g. Pullum & Scholz (2001), Postal (2003) and Sag & Wasow (2011; 2015).
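The constraint-based view can be sketched in a few lines of Python (a purely hypothetical illustration, not drawn from any actual HPSG implementation; all names and feature values are invented): a grammar is a set of constraints, and a sign is well-formed just in case it satisfies every constraint that applies to it.

```python
# Hypothetical sketch of the declarative, constraint-based view: signs are
# plain feature structures, a grammar is a set of constraints, and
# well-formedness is satisfaction of all of them -- no Merge/Agree-style
# operations ever build or transform a representation.

# A raising sentence: the verb's SUBJ value is identical to its
# complement's SUBJ value (the "same value" relation mentioned above).
raising_sign = {
    "head": "verb",
    "subj": "NP[3sg]",
    "comps": [{"head": "verb", "subj": "NP[3sg]"}],
}

def verbal_head(sign):
    return sign["head"] == "verb"

def subj_sharing(sign):
    # Every complement's SUBJ requirement matches the verb's own SUBJ.
    return all(c["subj"] == sign["subj"] for c in sign["comps"])

GRAMMAR = [verbal_head, subj_sharing]

def well_formed(sign):
    # Declarative: grammatical iff all applicable constraints hold.
    return all(constraint(sign) for constraint in GRAMMAR)

bad_sign = dict(raising_sign, subj="NP[1pl]")  # SUBJ values now differ
print(well_formed(raising_sign), well_formed(bad_sign))  # True False
```

Nothing here "moves" or rewrites a structure; the mismatched sign is simply not licensed, which is the sense in which the approach is monostratal.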

HPSG is also a framework which places considerable emphasis on detailed formal analyses of phenomena. Thus, it is not uncommon to find lengthy appendices setting out formal analyses. See, for example, Sag's (1997) paper on English relative clauses and especially Ginzburg & Sag (2000), which has a 50-page appendix. One consequence of this, alluded to above, is that HPSG has had considerable influence in computational linguistics.

A further important feature of HPSG is that it avoids abstract analyses with tenuous links to the observable data. Phonologically empty elements are only assumed if there is compelling evidence for them.<sup>5</sup> Thus, the fact that some English subordinate clauses contain a complementizer is not seen as evidence that there is a phonologically empty complementizer in subordinate clauses in which no complementizer is visible. Similarly, overt elements are only assumed to have properties for which there is clear evidence. The fact that many languages have a case system of some kind or some form of subject-verb agreement does not mean that they all do. This feature of HPSG stems largely from considerations about acquisition. Every element or property which is postulated for which there is no clear evidence in the data increases the complexity of the acquisition task and hence necessitates more complex innate machinery. This suggests that such elements and properties should be avoided as much as possible. It has important implications both for the analysis of individual languages and for how we see differences between languages.

<sup>5</sup>There may be compelling evidence for some empty elements in some languages. Thus, Borsley (2009: Sec. 8) argues that Welsh has phonologically empty pronouns. For general discussion of empty elements, see Müller (2016: Sec. 19.2).

### Robert D. Borsley

For HPSG, a linguistic analysis is a system of types, features, and constraints.<sup>6</sup> Types provide a complex classification of linguistic objects, features identify their basic properties, and constraints impose further restrictions. The central focus of HPSG is signs. For Ginzburg & Sag (2000), the type *sign* has the subtypes *lexical-sign* and *phrase*, and *lexical-sign* has the subtypes *lexeme* and *word*. Thus, we have the following type hierarchy:
- *sign*
	- *lexical-sign*
		- *lexeme*
		- *word*
	- *phrase*

Both *lexeme* and *phrase* have a complex system of subtypes. In both cases, complex hierarchies mean that the framework is able to deal with broad, general facts, very idiosyncratic facts, and everything in between. I will say more about this below.
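The logic of such hierarchies can be mimicked with ordinary class inheritance (a loose Python analogy; the class names follow the text, but the constraint strings are invented for illustration): whatever is stated of a supertype automatically holds of its subtypes, which is how broad generalisations and narrow idiosyncrasies coexist in one system.

```python
# Loose Python analogy (illustrative names, not HPSG notation): types form
# a hierarchy, and a constraint attached to a supertype is inherited by
# everything below it.

class Sign:
    constraints = {"has phonology and syntax/semantics"}

class LexicalSign(Sign):
    constraints = Sign.constraints | {"listed in the lexicon"}

class Phrase(Sign):
    constraints = Sign.constraints | {"has daughters"}

class Lexeme(LexicalSign):
    pass  # inherits everything true of lexical signs

class Word(LexicalSign):
    pass

# A lexeme is a lexical sign and a sign, but not a phrase:
print(issubclass(Lexeme, Sign), issubclass(Lexeme, Phrase))  # True False
print("listed in the lexicon" in Lexeme.constraints)         # True
```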

There are many other kinds of type. For example, there are types that are the value of fairly traditional features like person, number, gender, and case. A simple treatment of person might have the types *first*, *second*, and *third*, and a simple treatment of number the types *sing*(*ular*) and *plur*(*al*).<sup>7</sup> Unlike the types mentioned above, these are atomic types with no features. There are also types that provide the value of various less familiar features. For example, HPSG has a feature head, whose value is a *part-of-speech*, a type which indicates the part of speech of a sign and provides appropriate information, e.g. information about person, number, gender, and case in the case of nominal signs or finiteness in the case of verbal signs. Two other important features are subj(ect) and comp(lement)s, whose value is a list of *synsem* objects, combinations of syntactic and semantic information. The former, mentioned earlier, indicates what kind of subject a sign requires and the latter indicates what complements it takes. Obviously, there are plenty of opportunities here for languages to do things differently.

The type *lexeme* and its subtypes and the associated constraints are the core of the lexicon. In much HPSG work *lexeme* has two distinct sets of subtypes, one

<sup>6</sup>The related but slightly different framework, Sign-Based Construction Grammar, has a further major element, namely constructions. For SBCG signs are defined in terms of constraints on constructions, whereas standard HPSG has constraints applying directly to signs. SBCG is more complex in some respects but simpler in others. In particular, it has a simpler notion of sign and is able to dispense with a number of features and types which are assumed in HPSG. See Sag (2010; 2012) for discussion.

<sup>7</sup> In practice a more complex system of values may well be appropriate.

### 4 Comparative syntax: An HPSG perspective

dealing with part-of-speech information and one dealing with argument selection information. Here is a simple illustration based on Ginzburg & Sag (2000: 20):
- *lexeme*
	- part of speech: *v-lx*, …
	- argument selection: *intr-lx*, *s-rsg-lx*, …
- *srv-lx* instantiates both *v-lx* and *s-rsg-lx*

Small capitals are used for the two dimensions of classification, and *v-lx*, *intr-lx*, *s-rsg-lx*, and *srv-lx* abbreviate *verb-lexeme*, *intransitive-lexeme*, *subject-raising-lexeme*, and *subject-raising-verb-lexeme*, respectively. All these types will be subject to specific constraints. For example, *v-lx* will be subject to something like the following constraint:

$$(3)\quad v\text{-}lx\rightarrow\begin{bmatrix} \text{HEAD} & \textit{verb}\\ \text{SUBJ} & \langle \text{XP} \rangle \end{bmatrix}$$

This says that a verb lexeme has a verbal part of speech and requires a phrase of some kind as its subject. Similarly, we will have something like the following constraint for *s-rsg-lx*:

$$(4)\quad s\text{-}rsg\text{-}lx\rightarrow\begin{bmatrix} \text{SUBJ} & \langle\, \text{[1]} \,\rangle\\ \text{COMPS} & \langle\, \big[\text{SUBJ } \langle \text{[1]} \rangle\big] \,\rangle \end{bmatrix}$$

This says that a subject-raising-lexeme has a subject and a complement, and the subject is whatever the complement requires as a subject. Most of the properties of any lexeme will be inherited from its supertypes. Thus, very little information needs to be associated with specific lexemes in a system like this.
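The way constraints like (3) and (4) combine can be sketched as follows (a hypothetical Python rendering; the function names and the use of a shared list to play the role of the tag [1] are my own devices): each lexeme type contributes a partial description, and a subject-raising verb simply inherits the merged result, so almost nothing needs to be stipulated for the individual lexeme.

```python
# Hypothetical sketch of constraint inheritance for lexemes. Constraint (3):
# a verb lexeme has a verbal HEAD and an XP subject. Constraint (4): a
# subject-raising lexeme shares its SUBJ value with its complement's SUBJ
# (the tag [1], modelled here as one shared list object).

def v_lx_constraint():
    return {"head": "verb", "subj": ["XP"]}

def s_rsg_lx_constraint():
    shared = ["XP"]  # one token, playing the role of the tag [1]
    return {"subj": shared, "comps": [{"subj": shared}]}

def srv_lx():
    # Merge the supertype descriptions; nothing lexeme-specific is added.
    description = {}
    description.update(v_lx_constraint())
    description.update(s_rsg_lx_constraint())
    return description

seem = srv_lx()
# Structure sharing: SUBJ and the complement's SUBJ are one and the same.
print(seem["head"], seem["subj"] is seem["comps"][0]["subj"])  # verb True
```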

The lexicon is important for HPSG, and it has been the focus of much research. However, it is not as important as it is for Minimalism. In Minimalism, the syntax is just a few very general mechanisms – Merge, Agree, Copy – and how they operate is determined by the properties of lexical items. Hence, the lexicon is absolutely central. In HPSG, as explained below, the syntax is a complex system

Robert D. Borsley

of types and constraints. Hence the lexicon is rather less central than it is in Minimalism.

The type *phrase* and its subtypes and the associated constraints are central to the syntax of the language. It is widely assumed that type *phrase* has two distinct sets of subtypes, one dealing with headedness information and one dealing with clausality information. Here is a simple illustration:
- *phrase*
	- headedness: *headed-phrase*, with subtypes including *head-fill-ph*
	- clausality: *clause*, with subtype *interr-cl*
- *wh-interr-cl* instantiates both *head-fill-ph* and *interr-cl*

*Head-fill-ph*, *interr-cl*, and *wh-interr-cl* are abbreviations for *head-filler-phrase*, *interrogative-clause*, and *wh-interrogative-clause*, respectively. Other subtypes of *headed-phrase* are *head-complement-phrase* (for combinations of a word and its complements) and *head-subject-phrase* (for combinations of a phrase and its subject), and other subtypes of *head-filler-phrase* include *wh-relative-clause*. Again, all the types will be subject to appropriate constraints. For example, *headed-phrase* will be subject to a constraint requiring it to have a head daughter with which it shares certain properties. This system allows all sorts of generalizations to be captured. Properties that are shared by all phrases can be captured by a constraint on *phrase*, properties that are shared by all headed-phrases by a constraint on *headed-phrase*, properties that are shared by all head-filler-phrases by a constraint on *head-fill-ph*, and so on.
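The cross-classifying phrase hierarchy can likewise be mimicked with multiple inheritance (an illustrative Python sketch; the class names echo the abbreviations in the text, but the constraint strings are invented): a constraint stated at any level applies to every phrase type below it, so a *wh*-interrogative clause accumulates the constraints of both its headedness and its clausality supertypes.

```python
# Illustrative sketch: phrase types cross-classified along two dimensions.
# A constraint stated on any supertype holds of every subtype, so
# WhInterrCl collects constraints from both sides of the hierarchy.

class Phrase:
    constraints = ["general phrase constraint"]

class HeadedPhrase(Phrase):
    constraints = ["has a head daughter sharing HEAD properties"]

class Clause(Phrase):
    constraints = ["has clausal semantics"]

class HeadFillPh(HeadedPhrase):
    constraints = ["filler daughter matches the head's SLASH value"]

class InterrCl(Clause):
    constraints = ["denotes a question"]

class WhInterrCl(HeadFillPh, InterrCl):
    constraints = ["wh-filler takes this clause as its scope"]

def all_constraints(phrase_type):
    # Walk the method resolution order, most specific type first, and
    # gather each type's own constraints (not the inherited copies).
    gathered = []
    for cls in phrase_type.__mro__:
        gathered.extend(cls.__dict__.get("constraints", []))
    return gathered

print(all_constraints(WhInterrCl))
```

This is only an analogy (HPSG constraints are unified descriptions, not strings), but it shows how generalisations can be stated once, at exactly the level where they hold.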

Among other things, constraints on the various phrasal types provide information about what daughters they have. However, they don't say anything about the order of the daughters. This is the province of a separate set of constraints. Obviously, this is an area in which languages may differ.

An HPSG syntactic analysis is quite complex, especially compared with Minimalism, for which, as we have noted, syntax is just a few very general mechanisms. However, it is not as complex as the base component of an *Aspects*-style grammar (Chomsky 1965), nor as complex as the kind of grammar proposed within the earlier Generalized Phrase Structure Grammar (GPSG) framework (Gazdar et al. 1985).


Both approaches involve many different rules for combinations of a head and its complement, a set of rules for VPs, a set for PPs, and so on. Most HPSG work has a single *head-complement-phrase* type with no subtypes. This raises the question: when do we need to postulate a phrasal type? There are, of course, various different kinds of head-complement-phrase, but there is no need for any subtypes. A verb-phrase is just a head-complement-phrase headed by a verb with certain properties stemming from its head, while a prepositional phrase is just a head-complement-phrase headed by a preposition, again with certain properties stemming from its head. We can say the following:

A phrasal type is necessary whenever some set of phrases have properties which do not follow either from the more general types which they instantiate or from the lexical items that they contain.

This might lead one to wonder whether a *wh-interrogative-clause* type is necessary. One point to emphasize here is that a *wh*-interrogative-clause is not just a head-filler-phrase with a *wh*-phrase as the filler. The *wh*-phrase must have the immediately containing clause as its scope. This is unlike the situation in languages with so-called partial *wh*-movement. Consider, for example, the following German example from McDaniel (1989).

(6) German

	Was glaubt Hans [[mit wem] Jakob jetzt spricht]?
	what believes Hans with whom Jakob now speaks
	'What does Hans think Jacob is speaking to now?'

Here the *wh*-phrase is in the subordinate clause, but, as the translation makes clear, the scope of the *wh*-word *wem* is the whole sentence. It is also necessary to ensure that English *wh*-interrogatives have a pre-subject auxiliary if and only if it is a main clause. It may be possible to capture these facts without postulating a *wh-interrogative-clause* type, but it is not easy.

At least this is not easy if phonologically empty elements are not freely available. If such elements are freely available, it may well be possible to attribute the facts to the properties of a phonologically empty head. This is essentially the approach which is taken in Minimalism, in which head-filler-phrases involve structures of the following form, where X is C(omplementizer) or one of the elements that replaces it in work stemming from Rizzi (1997), e.g. Force, Top(ic), Foc(us).


$$(7)\quad \big[_{\text{XP}}\ \text{YP}\ \big[_{\text{X}'}\ \text{X}\ \text{ZP}\big]\big]$$

The idea seems to be that the properties of X ensure that the specifier YP and the complement ZP have the right properties. However, this idea never seems to be developed in any detail. A detailed development would involve precise lexical descriptions for the various empty heads. The sort of thing that is necessary was developed in some early HPSG work. Pollard & Sag (1994: Ch. 5) outlined an approach to English relative clauses involving a number of empty heads (an approach which was abandoned in Sag 1997). One of these heads has the following description:

$$(8)\quad \begin{bmatrix} \text{HEAD} & \big[\,\text{MOD } \text{N}'\big[\text{INDEX [1]}\big]\,\big]\\ \text{SUBCAT} & \big\langle\, \text{XP}\big[\text{LOC [2]},\ \text{REL } \{\text{[1]}\}\big],\ \text{S}\big[\textit{fin},\ \text{SLASH } \{\text{[2]}\}\big] \,\big\rangle\\ \text{CONTENT} & \ldots \end{bmatrix}$$

This interacts with certain phrase types to give a structure like (7). It is complex, but each component of it has a purpose. The mod feature indicates that the maximal projection of this element modifies an N′. The subcat feature indicates that it combines with a specifier containing a relative pronoun and a complement which is a finite clause with no complementizer but a non-empty slash feature ensuring that it contains a gap.<sup>8</sup> This feature also ensures that the specifier has the properties in the value of slash. The content feature ensures that the content of this element brings together the content of the modified N′ and the relative clause. Various principles of HPSG ensure that the combination of N′ and relative clause has the content of the empty head.<sup>9</sup> As noted above, this approach has been abandoned, but it gives some idea of what is involved in giving an explicit analysis of the kind of empty head that is central to the Minimalist approach to head-filler-phrases. It may be that Minimalist empty heads will have

<sup>8</sup>The subcat feature does work that is done by separate subj and comps features in later work. slash does the work that is done in MGG by A′-movement. For arguments that the slash mechanism provides a better account of the phenomena, see Borsley (2012).

<sup>9</sup>The to-bind features ensure that the rel and slash features do not appear any higher in the tree than they should.


simpler descriptions, but until such descriptions have been developed, we cannot really know.

Within Minimalism it is not just head-filler-phrases whose properties have to be derived in some way from a typically empty head. English clauses without an auxiliary have an empty T head, and English nominal constituents without a visible determiner have an empty D head. Thus, empty heads of various kinds are central to Minimalism. This is a reflection of the fact noted earlier that the syntax for Minimalism is just a few very general mechanisms. Minimalism is a bit like a version of HPSG with just two phrase types, an External Merge type and an Internal Merge type.<sup>10</sup> It follows that the real work must be done by lexical elements and often by empty lexical elements. Oddly, however, very little attention has been paid to the properties of these elements.<sup>11</sup>

If empty elements are only postulated when there is compelling evidence for them, there is no possibility of deriving the properties of different phrase types from various invisible heads. Hence, a fairly complex syntax is more or less inevitable. However, this need not be a problem for acquisition if the analysis is a fairly direct reflection of the observable data, as it is in HPSG.

As we have noted, a typical HPSG analysis will have a number of other subtypes of *head-filler-phrase*. Consider the following examples:


The bracketed material in (9) is a *wh*-relative, (10) is a *wh*-exclamative, and (11) is what has often been called a comparative correlative, a construction whose component clauses have been called *the*-clauses, e.g. in Borsley (2011). We have three types of head-filler phrases, each with various distinctive properties. *Wh*-relatives may contain *who* and *which* but not *what*. *Wh*-exclamatives may only contain *what a*(*n*) or *how*. Neither allows an auxiliary before the subject. Finally, *the*-clauses must contain *the* and a comparative word. The second clause but not the first may contain a pre-subject auxiliary:

	- b. \* The more do I read, the more I understand.

<sup>10</sup>For further discussion of the relation between the two approaches, see Müller (2013).

<sup>11</sup>Newmeyer (2005: 95, fn. 9) comments that "… in no framework ever proposed by Chomsky has the lexicon been as important as it is in the MP [Minimalist program]. Yet in no framework proposed by Chomsky have the properties of the lexicon been as poorly investigated."

Robert D. Borsley

For HPSG, these facts can be handled by constraints on three additional subtypes: *wh-relative-clause*, *wh-exclamative-clause*, and *the-clause*. See Ginzburg & Sag (2000) and Sag (2010).

English also has relative clauses with no visible relative pronoun. One might propose that such relative clauses have a phonologically empty relative pronoun. But, as we have noted, HPSG only assumes such elements if there is compelling evidence for them. In the absence of clear evidence for such an element, this is just an ad hoc way of minimizing differences between constructions. It is not difficult to provide an analysis which does not involve an empty element. For HPSG, as indicated earlier, relative clauses have a feature mod, whose value indicates what type of nominal phrase they modify. In a *wh*-relative clause, the value of mod is coindexed with the relative pronoun, as in (13):

The value of slash matches the filler and hence has the same index.<sup>12</sup> In a non*wh*-relative clause, the value of mod is coindexed directly with the value of slash, as in (14):

This just requires a type *non-wh-relative-clause* with an appropriate constraint. See Sag (1997) for discussion.

<sup>12</sup>In a more complex example such as the following, where the relative pronoun is just part of the filler, the value of slash and the relative pronoun will have different indices:

(i) whose brother I talked to

### 4 Comparative syntax: An HPSG perspective

## **4 HPSG and language variety**

An HPSG linguistic description involves types, features, and constraints, and languages may differ in any of these areas. Some types, features, and constraints will no doubt be universal, but others will be language-specific. The more general types such as *sign*, *lexical-sign*, *word*, *lexeme*, and *phrase* will probably occur in all languages with the same features, but many others are likely to be language-specific or to have language-specific features.

The types that are the value of various traditional features will differ from language to language for obvious reasons. Languages differ in how many genders and cases they have. Therefore, the features gender and case will differ in what types they have as possible values. Languages may also differ in whether or not they have these features. Only some languages have grammatical gender and only some languages show morphological case. Of course, it is possible to assume an abstract notion of case present in languages whether or not they have morphological case, but this complicates the acquisition task and necessitates more complex innate machinery than would otherwise be needed. It is probably not a position that would find favour outside MGG.<sup>13</sup>

A question that arises here is whether languages have the same gender and case feature if they have very different systems of values. Does a language with two genders have the same gender feature as one with ten? Probably most researchers would think that they do, but there is room for debate here. Of course, questions like this are not peculiar to HPSG but arise in any theoretical framework.

Within HPSG, whether or not a language has case is first and foremost a question of whether the type *noun* has case among its features. But there is another question here: does the type *adj* have case among its features? In some languages that have morphological case it is clearly a property of adjectives as well as nouns. Consider e.g. German or Arabic. But in other languages with morphological case, it does not extend to adjectives. The North-East Caucasian language Archi is a relevant example (see Bond et al. 2016 for discussion). Similar issues arise with gender. If a language has gender, then the type *noun* has gender among its features, but it may or may not be a feature of other types such as *adj* or *verb*.

What about other features, e.g. the head feature? This will probably have a large number of values (but not so many as it would have within Minimalism, where numerous "functional" parts of speech have been postulated, e.g. Force, Top(ic), and Foc(us), mentioned earlier). It is likely, however, that there will be some variation from language to language. Of course, just as there are questions about whether different languages can have the same gender and case features, so there are questions about whether they can have the same *noun*, *verb* and *adjective* types. Haspelmath (2010) thinks not. However, most HPSG linguists seem to assume they can, and this view is defended in Müller (2015: Sec. 2.2).

<sup>13</sup>An abstract notion of case (or Case) played an important role in the government and binding framework, but it seems to be of little importance within Minimalism and it has not been adopted outside MGG.

The questions that we have just highlighted arise in any theoretical framework. However, it is possible to sidestep them in a framework that does not emphasize formal analyses. HPSG with its emphasis on detailed formal analysis makes this more or less impossible.

The lexicon is obviously a major area in which languages differ. For Minimalism it is the only area in which differences may reside (a position often referred to as the Borer–Chomsky conjecture). This is an automatic consequence of the fact, highlighted earlier, that all the real work is done by lexical entries within Minimalism. This is not the case within HPSG given the important role of the system of phrasal types and associated constraints. However, for HPSG, many differences between languages are a lexical matter.

Most obviously, the same meaning will generally be associated with different phonological properties in different languages. English has *dog* where Welsh has *ci* and Polish has *pies*. Clearly, however, there can be other differences. A meaning may be associated with different head values in different languages. Thus, for example, the Welsh counterpart of the modal verb *must* is the noun *rhaid* "necessity", as in (15).

(15) Welsh

    Rhaid     i  mi adael.
    necessity to me leave.inf
    'I must leave.'

Clearly, such contrasts are common. The same meaning may also have different selectional properties in different languages. It is clear that the selectional properties of a word are predictable to a considerable extent from its semantics. However, there is quite a lot of room for variation. Where one language has an NP with one case, another language may have an NP with a different case, or a PP. Similarly, where one language has a finite clause, another may have a non-finite clause, or some kind of nominalized clause. Within HPSG, what case subjects have is also commonly seen as a matter of selection. In some languages, all subjects or all subjects of finite verbs may have nominative case, but in other languages there are other possibilities.

Turning to syntax, we emphasized above that HPSG only postulates empty elements when there is compelling evidence for them. This has obvious implications for comparisons between languages. If empty elements are not freely available, there is no possibility of saying that languages are much the same but look different because elements that are overt in one are empty in others. It follows that we should expect substantial differences between languages in this area.

The central question here is: how far can languages vary in the phrasal types that they employ and the constraints to which they are subject? Probably all languages will have the type *headed-phrase*, with *head-complement-phrase* as one of its subtypes. Perhaps they will also have the types *head-subject-phrase* and *head-filler-phrase*. But this may not be the case. Moreover, if two languages have the same type, it may well have different subtypes from language to language.

As noted above, it may be that all languages will have the type *head-filler-phrase*. But it is clear that languages will differ in what subtypes of *head-filler-phrase* they have. A *wh*-in-situ language will not have *wh-interrogative-clause* among the subtypes of *head-filler-phrase*. Since *wh*-interrogatives have the same structure as ordinary clauses in such languages, they will probably have a type *wh-interrogative-clause* which is a subtype of *head-subject-phrase*, giving the following situation:

(16)

    hd-subj-ph     inter-cl
          \         /
         wh-int-cl

One might wonder here whether phrasal types that have different supertypes (and are subject to different constraints) can really be viewed as the same type. I will not try to decide this question.

As we noted above, another subtype of *head-filler-phrase* in English is *wh-relative-clause*. It seems, however, that most languages have relative clauses with no sign of a fronted relative pronoun. One might propose that relative clauses in such languages have a phonologically empty relative pronoun. But, as emphasized above, this is not a move that would find favour in HPSG. In the absence of any concrete evidence for such an element, it is just an ad hoc way of minimizing differences between languages. Thus, whereas English has both a *wh-relative-clause* type and a *non-wh-relative-clause* type, many languages seem to just have the latter.

As also noted earlier, another subtype of *head-filler-phrase* is required to accommodate the two clauses in comparative correlatives such as the following:


### (17) The more I read, the more I understand.

Other languages have broadly similar constructions.<sup>14</sup> Consider e.g. French and Spanish:

(18) French

    Plus je lis,  plus je comprends.
    more I  read  more I  understand
    'The more I read, the more I understand.'

(19) Spanish

    Cuanto   más  leo,    (tanto)   más  entiendo.
    how.much more I.read  that.much more I.understand
    'The more I read, the more I understand.'

In the French construction, there is no counterpart of *the*, while the Spanish construction has two different elements, *cuanto* 'how-much' and *tanto* 'that-much', the latter being optional. Maybe both languages will have the same subtype of *head-filler-phrase* (though a different name might be appropriate) but the subtype will be subject to somewhat different constraints. In some languages, the second clause need not be a head-filler-phrase. One such language is Dutch, with examples like the following:

(20) Dutch

    Des     te meer je  leest, je  begrijpt   des     te minder.
    the.gen te more you read   you understand the.gen te less
    'The more you read, the less you understand.'

Thus, broadly similar constructions may differ in important ways and pose various analytic challenges.<sup>15</sup>

As noted above, the type *headed-phrase* has a number of subtypes. In addition to those mentioned, there is a *head-adjunct-phrase* type required for adjective and nominal combinations such as *old men* and verb-phrase and adverb combinations such as *walk slowly*. It may be that another subtype is necessary for verb-initial clauses such as (21).


<sup>14</sup>This is noted by den Dikken (2005: 498), who claims that the construction is "analyzable in keeping with the principles and parameters of UG". However, he does not provide an analysis. See Abeillé & Borsley (2008) for critical discussion.

<sup>15</sup>For further discussion and analyses of the French and Spanish constructions, see Abeillé et al. (2006); Abeillé & Borsley (2008).


### (21) Is Kim a linguist?

HPSG rejects the view that all branching is binary and generally assumes a ternary branching analysis for such clauses.<sup>16</sup> An obvious approach is one in which both the subject and the complement are sisters of the verb, as in the following structure:

This approach requires an additional subtype of *headed-phrase*, which can be called *head-subject-complement-phrase*, with an appropriate constraint. But there is an alternative approach to verb-initial clauses, in which the verb takes two complements and no subject, giving a structure like the following:

This is an ordinary head–complement structure, but it requires special lexical descriptions for auxiliary verbs. These can be derived from the standard lexical descriptions by a lexical rule. The first of these approaches is adopted in Ginzburg & Sag (2000: 36), while the second approach is assumed in Sag et al. (2003: 410). One possibility is that the two approaches are relevant to different languages. Thus, Borsley (1995) argues that the first approach is right for verb-initial clauses in Syrian Arabic, while the second is appropriate for verb-initial clauses in Welsh.

<sup>16</sup>The arguments for the binary branching restriction have never been very persuasive; see e.g. Culicover & Jackendoff (2005: 112–116).


One further point to note here is that a structure in which both the subject and the complement (or complements) are sisters of the verb is potentially relevant not just to clauses in which verb and complement(s) are separated by the subject but also to clauses in which they are adjacent. That is, there may be SVO or SOV clauses in which there is a flat structure and no VP. Thus, Borsley (2016) argues that such an analysis is appropriate for SOV clauses in Archi. On this analysis, (24) has the schematic analysis in (25).

(24) Archi

    zari    noˤš              darcʹ-li-r-ši        e‹b›tʹni.
    1sg.erg horse.iii[sg.abs] post-obl.sg-cont-all ‹iii.sg›.tie.pfv
    'I tied the horse to the post.'

(25) [S [NP zari] [NP[case abs] noˤš] [NP[case obl] darcʹ-li-r-ši] [V e‹b›tʹni]]

Thus, the fact that V and O are normally adjacent in some language does not necessarily mean that they form a VP constituent.

A more general point that we should make here is that it is important not to assume too quickly that something that looks rather like an English realization of a specific phrase type is just another realization of that type. For HPSG, English subject-initial clauses are realizations of a *head-subject-phrase* type. Arabic also has subject-initial clauses, e.g. the following:

(26) Arabic

    T-tullaab-u      qaabaluu  / *qaabala  Aħmad-a.
    the-students-nom met.3pl.m   met.3sg.m Ahmad-acc
    'The students met Ahmad.'

One might assume that these are head-subject-phrases. However, another possibility is that they are verb-initial clauses with an initial NP topic and hence head-filler-phrases. This might seem dubious initially. The verb in a subject-initial clause shows full agreement for person, gender, and number. The situation is different in verb-initial clauses, as the following shows:


(27) Arabic

    qaabala   / *qaabaluu  T-tullaab-u      Aħmad-a.
    met.3sg.m   met.3pl.m  the-students-nom Ahmad-acc
    'The students met Ahmad.'

Here, we have partial agreement, agreement for person and gender but not number. This might be seen as evidence against the idea that subject-initial clauses are clauses with an initial topic. Consider, however, an example with an initial topic interpreted as subject of a subordinate clause:

(28) Arabic

    T-tullaab-u      ʔiqtaraħtu      [ʔan  yušaarikuu        / *yušaarika        fii l-musaabaqat-i].
    the-students-nom suggested.1sg.m  that participate.3pl.m   participate.3sg.m in  the-competition-gen
    'The students I suggested participate in the competition.'

The complementizer *ʔan* only introduces verb-initial clauses. Hence the subordinate clause here is a verb-initial clause, but it shows full agreement. This seems surprising. However, the problem disappears if we assume that the clause has a null pronominal subject coindexed with a preceding topic. Null subject sentences, which I assume have a null pronominal subject, show full agreement. Thus, the following can only have the meaning indicated and cannot mean that they met Ahmad:

(29) Arabic

    laqad  qaabala   Aħmad-a.
    indeed met.3sg.m Ahmad-acc
    'He met Ahmad.'

Essentially the same analysis can be applied to (26). That is, it too can be analysed as involving an initial topic coindexed with a null pronominal subject. If this is right, (26) is not a head-subject-phrase but a head-filler-phrase.<sup>17</sup> Maybe Arabic has some other kinds of head-subject-phrase or maybe it has no head-subject-phrases at all.

We should now say something about word order. For HPSG, as for some other frameworks, some word order differences between languages are not very important. We noted earlier that constraints on the various phrasal types provide information about what daughters they have, but say nothing about the order of the

<sup>17</sup>This argument is taken from Alotaibi & Borsley (2013).


daughters, which is the province of a separate set of constraints. It follows that head-initial and head-final languages may have head-complement-phrases that are identical apart from word order. This contrasts with the situation in Kayne's (1994) antisymmetry version of MGG in which complement-head order is the product of a movement process and hence more complex than head-complement order. The HPSG position is more like that of versions of MGG that assume a directionality parameter. However, unlike such approaches, HPSG does not assume that a language will linearize all head-complement structures in the same way. Hence, there is no problem with a language like Finnish, which has verb-object order but postpositions, or a language like Persian, which has object-verb order but prepositions. Such languages will have two different linear precedence constraints, while languages which order all head-complement structures in the same way will have just one. Hence the latter are simpler in this area, and this makes it unsurprising that they are more common.<sup>18</sup>

The fact just highlighted means that SVO and SOV languages may have VPs licensed by the same *head-complement-phrase* type. VSO languages are different if they have either of the analyses in (22) and (23). (On the analysis in (23) the clause is a head-complement-phrase but it is not an ordinary VP.) However, there is an alternative approach which might be taken to VSO clauses. Much work in HPSG has proposed that linear order is a reflection not of the constituent structure of an expression but of a separate system of order domains (see Reape 1992; Müller 1996; Kathol 2000). Within this approach, the constituent structure of an expression is encoded as the value of a dtrs (daughters) feature and the order domain as the value of a dom(ain) feature. Adopting it, one might propose that the Welsh VSO sentence in (30) has the schematic analysis in (31).

(30) Welsh

    Gwelodd     Emrys y   ddraig.
    see.pst.3sg Emrys the dragon
    'Emrys saw the dragon.'

(31)

    [ synsem  S
      dtrs    ⟨ [Emrys], [gwelodd y ddraig] ⟩
      dom     ⟨ [gwelodd], [Emrys], [y ddraig] ⟩ ]

On this analysis Welsh has finite VPs just like English. One could propose essentially the same analysis for verb-initial clauses in a language in which the

<sup>18</sup>Essentially this point was made by Fodor & Crain (1990) in a discussion focusing on the earlier GPSG framework.


existence of finite VPs is uncontroversial, e.g. English. In Borsley (2006), I argue against an analysis of this kind for Welsh and in favour of an analysis of the kind in (23). It could be, however, that the approach in (31) is appropriate for other VSO languages or for verb-initial clauses in languages of other types.
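The division of labour in (31) between constituent structure (the dtrs list) and linear order (the dom list) can be made concrete with a small computational sketch. This is purely an illustration, not HPSG machinery itself: the `Sign` class and `linearize` function are invented here for expository purposes.

```python
# Toy illustration of the dtrs/dom split, not actual HPSG formalism.
# A sign's constituent structure (dtrs) is kept separate from its
# order domain (dom), so a Welsh clause can have an English-like
# [subject, VP] constituency while surfacing verb-first.
from dataclasses import dataclass, field

@dataclass
class Sign:
    dtrs: list = field(default_factory=list)  # immediate constituents
    dom: list = field(default_factory=list)   # order domain elements

def linearize(sign: Sign) -> str:
    """The surface string is read off the order domain, not off dtrs."""
    return " ".join(sign.dom)

# Corresponding to (31): dtrs = subject NP + finite VP,
# dom = verb, subject, object (VSO linear order)
welsh_clause = Sign(
    dtrs=["Emrys", "gwelodd y ddraig"],    # NP + VP constituency
    dom=["gwelodd", "Emrys", "y ddraig"],  # verb-initial surface order
)

print(linearize(welsh_clause))  # gwelodd Emrys y ddraig
```

On this sketch, constituency-sensitive processes would consult dtrs while linearization consults dom, which is the sense in which a VSO language could nevertheless have finite VPs.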

Even if order domains are not appropriate for Welsh VSO clauses, they provide a plausible approach to various other phenomena. For example, they might be used to provide an account of extraposed relative clauses, such as (32), which might have the analysis in (33).

(32) A man came in who looked like Chomsky.

(33)

    [ synsem  S
      dtrs    ⟨ [a man who looked like Chomsky], [came in] ⟩
      dom     ⟨ [a man], [came in], [who looked like Chomsky] ⟩ ]

Alternatively, however, one might assume that such examples are rather like head-filler-phrases but with the filler constituent on the right.

Order domains seem most plausible as an approach to the sorts of discontinuity that are found in so-called nonconfigurational languages such as Warlpiri.<sup>19</sup> However, they may well have a role to play in more familiar languages. Exactly how much of a role they play in syntax is an unresolved matter.

The preceding remarks illustrate the fact that there are often a number of plausible approaches to some syntactic phenomenon. This means that it is not easy to know what the right analysis is and that it is hard to be confident that one has the right analysis for any phenomenon. Deciding on an analysis is somewhat easier if you subscribe to a theoretical framework which limits the range of possible analyses, e.g. by excluding more than binary branching or by insisting with Kayne (1994) that there is a universal specifier-head-complement order. The first restriction is generally accepted within Minimalism, and the second quite widely accepted. Outside Minimalism, however, the view is that there is little motivation for them. Whatever framework one subscribes to, there are many unresolved issues, even in a language like English, which has been studied by numerous syntacticians over many decades. Naturally, there are many more such issues in languages which have received a lot less attention. All this means that there is little basis for strong claims about language universals or the extent to which languages may vary.

<sup>19</sup>See e.g. Donohue & Sag (1999).


## **5 Further discussion**

The previous section ended on what might be seen as a negative note. It seems to me that this is a realistic assessment, but some researchers have painted a much rosier picture. Not so long ago, Baker (2001) commented: "We are approaching the stage where we can imagine producing the complete list of linguistic parameters, just as Mendeleyev produced the (virtually) complete list of natural chemical elements" (Baker 2001: 50). It is not clear that many would share this optimism now. In any event, I do not see how this could be justified. We could only be confident about any set of proposals about parameters if we had detailed formal analyses for a wide range of languages employing them. There are of course proposals about many phenomena in many languages, but the detail and precision are generally lacking. Thus, Culicover & Jackendoff (2005: 535) comment that "much of the fine detail of traditional constructions has ceased to garner attention".

The limited nature of our knowledge is sometimes recognized within MGG. Thus, Chomsky (1995: 382, fn. 22) remarks that: "… we still have no good phrase structure theory for such simple matters as attributive adjectives, relative clauses, and adjuncts of different types". No doubt there has been some progress since 1995, but there are clearly still many unresolved issues about these phenomena both in English and in other languages. Obviously, other languages are very important here. The vast majority of languages have had a fraction of the attention that has been lavished on English. If other languages were broadly similar to English, this might not matter, but it is hard to deny that there are major differences. I highlighted a number of differences in the previous section, but it may be that languages can differ even more fundamentally from English. Koenig & Michelson (2014) argue that Oneida has no standard syntactic features. In similar vein, Gil (2005; 2009) argues that Riau Indonesian has no parts of speech, almost no function words, and virtually no morphology.<sup>20</sup> It is possible that someone will be able to show that these languages are less different from familiar languages than they appear, but at present they suggest that language variety is rather greater than is often suggested. We may eventually have a firm basis for claims about language universals and the extent to which languages may vary, but currently this seems a long way off. So at least it seems to most people within HPSG.

Another feature of HPSG, alluded to above, which is very relevant in the present context is the emphasis on the importance of firm empirical foundations in the form of detailed formal analyses of the kind advocated by Chomsky in *Syntactic structures*. Whereas MGG typically offers sketches of analyses which might be fleshed out one day, HPSG commonly provides detailed analyses which can be set out in an appendix. As noted above, Ginzburg & Sag (2000), which sets out its analysis of English interrogatives in a 50-page appendix, is a notable example. Arguably one can only be fully confident that a complex analysis works if it is incorporated into a computer implementation. Hence, computer implementations of HPSG analyses are quite common. Particularly important here is the CoreGram project reported in Müller (2015), which seeks to develop computational grammars for a diverse range of languages. Among other things, this permits a fairly precise measure of how similar or how different grammars are, in terms of shared constraints or shared lines of code. Analyses that are not implemented or are only partly implemented can be very valuable, but it seems likely that implemented analyses will be increasingly important in syntax, and that includes comparative syntax.

<sup>20</sup>But see Yoder (2010) for some critical discussion.

A further important feature of HPSG, highlighted above, is its avoidance of abstract analyses with elements or properties for which there is no clear evidence in the data. There may be real evidence for such elements and properties, but research in HPSG suggests that they are generally unnecessary. For example Ginzburg & Sag (2000) can be seen among other things as a demonstration that English interrogatives do not require either movement processes or abstract structures, and much the same can be said of Sag (1997) and English relative clauses. As was emphasized above, grammars that are quite closely related to the observable data pose less of a problem for acquisition than grammars that are more abstract and hence create less need for some innate apparatus. This is surely something that anyone should view as a good thing.<sup>21</sup>

As noted above, this outlook on grammar construction entails that the fact that many languages have some element or property should not be seen as evidence that they all do. Many languages have case and many languages have agreement, but it does not follow that they all do. In much the same way, many birds fly, but it does not follow that they all do, even those such as ostriches and penguins which never seem to get off the ground. As Müller (2015: 25) puts it, "grammars should be motivated on a language-specific basis." Does this mean that other

<sup>21</sup>One might think that the acquisition task is fairly simple if languages have essentially the same structures differing only in what is and what is not visible. But this seems doubtful. As Fodor (2001: 765) puts it, "It is clear now that even if the structural scaffolding of sentences is everywhere fixed and the same, any particular sentence may be highly ambiguous with respect to how its words are attached to that scaffolding." Essentially, the more complex the structure of sentences is and the more invisible material it may contain, the harder it is for the learner to determine where anything is. As Fodor (2001: 763) comments, on this view, "natural language design is extremely cruel to children".


languages are irrelevant when one is investigating a specific language? Clearly not. As Müller also puts it,

In situations where more than one analysis would be compatible with a given dataset for language X, the evidence from language Y with similar constructs is most welcome and can be used as evidence in favor of one of the two analyses for language X. (Müller 2015: 43)

In practice, any linguist working on a new language will use apparently similar phenomena in other languages as a starting point. It is important, however, to recognize that apparently similar phenomena may turn out on careful investigation to be significantly different. I made this point in the last section in connection with subject-initial clauses in Modern Standard Arabic. Arabic comparatives provide a rather different illustration.

Like many languages, Modern Standard Arabic has simple comparatives with a comparative form of an adjective and complex comparatives with two separate elements:

### (34) Arabic


Superficially, these examples are much like their English translations and like simple and complex comparatives in many other languages. However, as the gloss of (34b) makes clear, *thakaʔ-an* is not an adjective like *intelligent*, but what can be called an adjectival noun (with accusative case). This might seem like a minor, unimportant difference. However, there is evidence that it is an important matter, reflecting the fact that Arabic complex comparatives are a quite different construction from the complex comparatives of many other languages. The most important evidence comes from the fact that the construction can contain not just adjectival nouns but also ordinary nouns:

(35) Arabic

    ʔanaa ʔakthar-u maal-an   min  ali-in
    1sg.m more-nom  money.acc from Ali-gen
    'I have more money than Ali.'


It is fairly clear that (34b) involves the same comparative construction as (35). To reflect this, it could be translated as "I have more intelligence than Ali". The comparative construction in (34b) and (35) is quite like what is called the adjectival construct construction, illustrated in (36).

(36) Arabic

    'anta azīm-u    l-hazz-i
    you   great-nom the-fortune-gen
    'You have great luck.', 'You are very lucky.'

The nominal in the adjectival construct is genitive and definite whereas that in the comparative construction is accusative and indefinite. However, in both cases, we have an adjective with an extra nominal complement, and in both, we have what can be called a possessive interpretation. Thus, the construction in (34b) is very different from its counterpart in English and other languages.<sup>22</sup>

Thus, phenomena that look familiar may turn out to be rather exotic. Of course it may also turn out that what look like unfamiliar phenomena are not so very different from phenomena one is familiar with. All this just means that syntax is complex and that it is not easy to get a clear picture of the syntax of any language.

# **6 Concluding remarks**

I have argued in the preceding pages that HPSG is a framework that can make a major contribution to comparative syntax. It has a number of features that are important here. The first is its emphasis on detailed formal analyses of the kind envisaged in *Syntactic structures*, often incorporated into a computer implementation. This means that the framework provides firmer foundations than some other approaches for claims about individual languages and ultimately about language in general. Secondly, it is cautious about advancing strong claims about the universal properties of language and the extent of linguistic variation. Some may feel that bold conjectures act as a stimulus for research, but it is not clear that they are any more effective in this regard than sober and cautious assessments of what is and is not known. Finally, there is the avoidance of abstract analyses with tenuous links to the observable data. As I have emphasized, this makes the acquisition problem less difficult than it would be if grammars were more abstract and hence creates less need for innate apparatus. For these reasons, HPSG has a lot to offer for anyone interested in comparative syntax and looking for a suitable theoretical framework.

<sup>22</sup>See Alsulami et al. (2017) for detailed discussion of the facts.

### Robert D. Borsley

## **Abbreviations**


## **Acknowledgements**

I am grateful to Stefan Müller, Fritz Newmeyer, Maggie Tallerman and two anonymous referees for their comments on an earlier version of this paper. Of course I alone am responsible for what appears here.

# **References**


4 Comparative syntax: An HPSG perspective




# **Chapter 5**

# **Some (new) thoughts on grammaticalization: Complementizers**

## Anna Roussou

University of Patras

Grammaticalization creates new grammatical exponents out of existing (lexical) ones. The standard assumption is that this gives rise to categorial reanalysis and lexical splits. The present paper argues that categorial reanalysis may not be so pervasive and that lexical splits may also be epiphenomenal. The set of empirical data involves the development of (Indo-European) complementizers out of pronouns. The main claim is that the innovative element (the complementizer) retains its nominal feature; thus strictly speaking, there is no categorial reanalysis, but a change in function and selectional requirements, allowing for an IP complement as well. As a complementizer, the pronoun is semantically weakened (the nominal core), and phonologically reduced (no prosodic unit). In its pronominal use, it may bind a variable (interrogative/relative) and defines a prosodic unit. What is understood as a lexical split then reduces to a case of different selectional requirements, followed by different logical form (LF) and phonetic form (PF) effects.

# **1 An overview**

According to Meillet (1958 [1912]), the two basic mechanisms for language change are *grammaticalization* and *analogy*. While grammaticalization creates new grammatical material out of "autonomous" words, analogy develops new paradigms by formal resemblance to existing ones. Grammaticalization has received great attention in the literature (for an overview see Narrog & Heine 2011), raising the question whether it is a mechanism of change or an epiphenomenon. The answers provided mainly depend on the theoretical framework adopted and the

Anna Roussou. 2020. Some (new) thoughts on grammaticalization: Complementizers. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 91–111. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972836


view on how "grammar" is to be defined. Thus, in functional approaches, grammaticalization is a mechanism that leads to the formation of "grammar" (or of grammatical structures), while in formal approaches, grammaticalization is either denied altogether (Newmeyer 1998; Lightfoot 1998; 2006; Janda 2001; Joseph 2011) or considered an epiphenomenon (Roberts & Roussou 2003; van Gelderen 2004).

Despite the different views on the topic, it is generally accepted that grammaticalization (or whatever it reduces to) has a visible effect cross-linguistically. There are common tendencies and patterns in how the lexical to functional change may take place (see Heine & Kuteva 2002 for a wide range of examples). For example, complementizers can have their origin in pronouns (interrogatives, demonstratives, relatives), verbs (*say*, *like*, etc.), nouns (*thing*, *matter*), or prepositions (allative). Within functional/typological perspectives, grammaticalization is primarily viewed as a "semantic" process where concepts are transferred into constructions (for an overview, see Hopper & Traugott 2003). Once the relevant elements are used as grammatical markers, they show semantic "bleaching" (loss of primary meaning) and phonological reduction. According to Traugott (2010), grammaticalization is not only a matter of reduction, but also of pragmatic expansion in terms of content. At the same time, a typological account shows that some lexical items are more amenable than others to giving rise to grammatical markers, although this is not deterministic in any respect. Still, this observation points towards an interesting direction with respect to how the lexicon interacts with (morpho-)syntax.

Within the formal approach to grammaticalization, the basic assumption is that it is an epiphenomenon. More precisely, grammaticalization is argued to derive through the loss of movement steps. In more technical terms, it is a change from internal to external merge. This change gives rise to the creation of new exponents of functional heads, along with structural simplification (see for example Roberts & Roussou 2003; van Gelderen 2004). The notion of simplification is built on the idea that external merge draws directly on the lexicon, while internal merge draws on lexical items already present in the structure.<sup>1</sup> Thus internal

<sup>1</sup>The change from internal to external merge is rather simplified here. As Roberts & Roussou (2003) point out, this change may involve additional steps, including the "movement" of features from a lower to a higher position; this is, for example, the case with the development of the subjunctive marker *na* in Greek, where the expression of mood changes from being an inflectional/affixal feature to being a modal marker (*na*) in the left periphery (pp. 73–87). In all cases though, the change known as grammaticalization is selective, affecting a subset of lexical items, as also pointed out by an anonymous reviewer; a more thorough discussion is provided in Roberts & Roussou (2003: Ch. 5).


merge gives rise to displacement (movement) and requires at least two copies of the same lexical item in different structural positions. The change from internal to external merge implies a single copy in the higher position and elimination of the lower one. This single copy becomes the new exponent of the higher (functional) head. Since merge is bottom-up, it follows that internal merge will also follow this upward (and leftward) path, and the change from internal to external merge will also affect the upper parts of the structure.

In standard terms, irrespective of the framework adopted, a basic tenet is that grammaticalization involves a change from lexical to functional, or from functional to functional, as in (1):

(1) Content word > grammatical word > clitic > inflectional affix

In (1) above, what appears on the left-hand side of the arrow ">" indicates a preceding stage. Assuming that lexical categories (content words) are embedded under functional projections (grammatical morphemes), the order in (1) is consistent with the view that "grammaticalization" is accounted for in a bottom-up fashion. More precisely, a lexical item *α* can start as part of a lexical projection, and by internal merge occur in a higher functional position *β*. The loss of movement steps has an effect on the categorial status of *α*, which now becomes the exponent of *β*. The change from grammatical word to clitic does not affect the functional status but affects the morphosyntactic status of *α*. The same holds for the final stage (from clitic to an inflectional affix), where *α* becomes part of the morphological structure, as best summarized in Givón's (1971: 413) quote "today's morphology is yesterday's syntax".

In the present paper I retain the basic view of Roberts & Roussou (2003) on grammaticalization, namely that it is an epiphenomenon; I also use the term "grammaticalization" in a rather loose way, as the development of grammatical elements out of existing ones. I take complementizers with a pronominal source as the exemplary case, a pattern which is very typical of the Indo-European languages. The primary question raised is whether the change from pronoun to complementizer implies categorial reanalysis. The secondary question is whether this reanalysis gives rise to a lexical split that ends up creating homophonous lexical items (i.e., pronoun vs complementizer). The claim put forward here is that the grammaticalized element retains (or at least may retain) its categorial core, thus eliminating homonymy in the lexicon. In §2, I discuss the dual status of some lexical items as pronouns and complementizers, arguing that to a large extent the distinction is functional and not really formal. In §3, I consider the properties of Greek declarative complementizers in connection with their pronominal


uses, showing that we can account for the differences in terms of their logical form (LF) and phonetic form (PF) properties. In §4, I consider the implications of this distribution for grammaticalization, and argue that what looks like a change from pronoun to complementizer indicates a change in selection (expansion) and scope, with visible PF effects also. §5 concludes the discussion.

## **2 On complementizers and pronouns**

Kiparsky (1995) argued that the development of complementizers in Indo-European shows a change from *parataxis* to *hypotaxis*: a previously independent clause becomes dependent on a preceding matrix predicate. This change is linked to a previous one, namely the development of the C position as manifested by V2-phenomena. Another way to view this change is as an anaphoric relation between a pronoun in the first clause which refers to the second (paratactic) clause. Roberts & Roussou (2003) and van Gelderen (2004) argue that in this configuration, the pronoun is reanalyzed as part of the second clause, with the latter becoming part (hypotaxis) of the now main clause since it is embedded under the matrix predicate. This can happen in two steps: first, the pronoun retains its pronominal status and qualifies as a phrase (in a Spec position), and second, it is reanalyzed as a C head, as in (2):

(2) [IP [VP V pronoun]] [IP ] > [IP [VP V [pronoun [IP ]]]]

Roberts & Roussou (2003: 118) argue that although this looks like "lowering", the reanalyzed structure can still be construed in an upward fashion, since the boundary of the second clause shifts to the left (hence upward) to include the pronoun. In their terms, this kind of change is both categorial (pronoun > complementizer) and structural (creating a complement clause headed by the reanalyzed pronoun).<sup>2</sup> A further aspect of this change is that it has created a new exponent for the C head.

The use of pronouns as complementizers is quite pervasive in Indo-European languages. English *that* is related to the demonstrative *that* (*that book*), Romance

<sup>2</sup>Kayne (2005: 238) argues that as a complementizer *that* merges above the VP, while as a pronoun it merges inside the VP, accounting for the fact that as a pronoun it may inflect (in Germanic) for case, while as a complementizer it cannot. Within this framework the change from parataxis to hypotaxis would involve merger of *that* in different positions, signalling embedding in line with Kayne's requirement that "For an IP to function as the argument of a higher predicate, it must be nominalized" (Kayne 2005: 236). The complementizer status further implies a silent N.


*que/che* is related to the interrogative pronoun 'what' (*che fai?* 'what are you doing?'), Greek *oti* is related to a relative pronoun, while *pos* is related to the interrogative 'how', to mention just a few examples (see also Rooryck 2013 on French *que* as a single element). In recent approaches to complementation, the relation between pronouns and complementizers is argued to hold synchronically as well. There are basically two ways of analyzing sentential complementation: either to reduce complement clauses to some form of relatives (e.g., Arsenijević 2009; Kayne 2010; Manzini & Savoia 2011), or to reduce relative clauses to an instance of complementation (e.g., Kayne 1994). Either way, the link between the two types of clauses is evident. If indeed there is structural similarity between relatives and complement clauses and the assumption is that complementizers somehow retain their (pro)nominal feature, then what has been considered as categorial reanalysis in the context of grammaticalization may have to be reconsidered.

In their discussion, Roberts & Roussou (2003) point out that one of the differences between D *that* and C *that* has to do with the different complements they embed. In particular, demonstrative *that* takes an NP complement (a set of individuals), while complementizer *that* takes an IP complement (a set of situations/worlds). Manzini & Savoia (2007; 2011) and Roussou (2010) argue that complementizers of this sort are (pro-)nominal. They merge as arguments of the (matrix) selecting predicate and take the CP/IP as their complement; strictly speaking then, they are outside the complement clause. This kind of approach maintains the view that there is embedding, mediated by the "complementizer", but essentially the relevant element, being a pronominal of some sort (demonstrative, relative/interrogative) occurs as the argument of the predicate. On this basis, it is arguable whether the pronoun changes formally or just functionally. To be more precise, the question is how "real" the D > C reanalysis is. The alternative is to assume that the new element classified as a complementizer retains its nominal property, but expands in terms of selection, allowing not only for an NP but for an IP complement as well.

The approach just outlined regarding complementation is very close to Davidson's (1997 [1968]: 828–829) view according to which

sentences in indirect discourse, as it happens, wear their logical form on their sleeves (except for one small point). They consist of an expression referring to a speaker, the two-place predicate "said", and a demonstrative referring to an utterance.

So the sentence in (3a) has the logical structure in (3b):


	- a. Galileo said that the earth is round.
	- b. Galileo said that: the earth is round.
	- c. Galileo [v/VP said [that [CP/IP the earth is round]]]

The logical structure in (3b) can translate to the syntactic structure in (3c) where the complementizer is the argument of the selecting predicate. If *that* is construed as a pronoun in (3c), then it retains its nominal feature. This is reminiscent of Kayne's (1982) claim that complementizers have the role of turning the proposition to an argument (also Kayne 2005; see fn. 2). It also recalls Rosenbaum's (1967: 25) analysis, according to which complementizers "are a function of predicate complementation and not the property of any particular sentence or set of sentences". In Rosenbaum's analysis, complementizers are introduced transformationally, and complement clauses are sentences dominated by an NP node.

Leaving many details aside, the next question that arises is to what extent the complementizer splits apart from the pronoun it originates from, leading a life of its own. Is this a lexical split that diachronically yields two homophonous elements, e.g. demonstrative *that* vs complementizer *that*, interrogative *che* vs complementizer *che*, interrogative *pos* vs complementizer *pos*, and so on? Homonymy is an instance of accidental overlap in form between clearly distinct meanings. However, the phenomenon here is very systematic within and across grammars and as such it cannot be treated as accidental. If we exclude homonymy (synchronically), we still need to account for the differences between the original pronoun and the derived complementizer. Note that while *che* as an interrogative requires a Q operator, *che* as a complementizer is declarative and incompatible with a Q-selecting predicate. The same holds for Greek *pos*, which shows a split between an interrogative and a non-interrogative use, as we will see in the following section.

Interestingly, English *how* shows a similar distribution. Consider the following examples from Legate (2010: 122):

	- b. And don't you start in on how I really ought to be in law enforcement or something proper (www.ealasaid.com/writing/shorts/nightchild.html).
	- c. They told me about how the tooth fairy doesn't really exist.
	- d. \* They told me about that the tooth fairy doesn't really exist.


A clear difference between *that* and *how* is that *how* can be embedded under a preposition, while this is not the case with *that*, as in (4d).<sup>3</sup> Legate argues that *how*-declarative complements are associated with factivity (see also Nye 2013); structurally, they have an abstract DP-layer (c-selection), and semantically they qualify as propositions (s-selection). Unlike *that*, *how* is excluded in relative clauses. The use of *how* as a complementizer does not affect the use of *how* as an interrogative manner adverbial though, as in "*how* did you fix the car?" (= in what manner/way). The question then is whether complementizer *how* is a grammaticalized version of the manner interrogative and a separate entry in the lexicon.

In relation to the above, note that van Gelderen (2015) discusses another use of *how* in matrix yes/no questions, where it remains interrogative (i.e. restricted to questions) but has no adverbial manner interpretation. Consider the following examples (van Gelderen 2015: 164–165):

	- a. How would you like to go to the park?
	- b. How would you mind clearing a blocking path for Brando Jacobs, eh? (https://twitter.com/jimshearer/status/178244064238514177)

As van Gelderen argues, this *how* occurs in matrix yes/no questions, and is neither a manner adverbial nor a complementizer. She further shows that throughout the history of English, *how* was not restricted to a *wh*-manner adverbial, but also conveyed an exclamative or an emphatic reading. In the latter use it emphatically modifies the modal. Let us illustrate this with the example in (5a). In the manner reading, *how* gives rise to the interpretation "in what way would you like to go to the park?". In the non-manner reading it expresses the degree to which something may hold, placing emphasis on the modal; the reading is something like "Can it be the case that you'd like to go to the park?", that is, an epistemic one. As van Gelderen shows, the emphatic interpretation is already attested in Old English *hu*, so this is not an innovation. What is an innovation, though, is the yes/no reading of the question introduced by *how*.

On the basis of the empirical data, van Gelderen argues for the following steps in the development of *how* in yes/no questions (emphatic/epistemic) and complement clauses (complementizer):

<sup>3</sup>One of the reviewers points out that *how* can be embedded under a preposition because it has a degree feature, which *that* lacks. More precisely, *about* refers to properties which can be provided by the adverbial *how*; *that* refers to truth values, so embedding under *about* results in an empty intersection, hence the ungrammaticality.


	- b. Merge in C [i-degree] (interpretable feature) >
	- c. Spec to Head (not complete for *how*).

The step in (6b) involves a change from internal to external merge with an interpretable feature. The step in (6c) eliminates specifiers in favor of heads. According to her analysis, this stage is not complete for *how*, while it is for *whether*. The steps in the reanalysis of *how* affect the features associated with it. More precisely, van Gelderen argues that, as a *wh*-element, *how* has the feature bundle {i-wh, manner/quantity/degree}. The formal *wh*-feature is interpretable and agrees with the uninterpretable *wh*-feature of C in questions, triggering a *wh*-question reading. If the interpretable feature of *how* is that of [i-degree], as opposed to [wh], then it is emphatic (i.e., to such a great degree). If this latter feature becomes uninterpretable, then *how* is merged directly in C and qualifies as a complementizer. In yes/no questions, as in (5), *how* has an interpretable polar feature. In this latter context, according to van Gelderen, the Spec-to-Head reanalysis is not complete.<sup>4</sup>

The account provided by van Gelderen (2015) highlights different stages of *how* both diachronically and synchronically, by manipulating the repertoire of features associated with *how* and its structural position. Synchronically, this allows for different functions associated with *how* (from *wh*-interrogative to polar interrogative to declarative factive). One way to account for this is to assume that activating different features gives rise to different interpretations. Instead of treating the different *hows* as distinct elements (homonyms), we can treat all instances of *how* as a single but polysemous element, where polysemy is structurally-conditioned. For example, in the context of a Q operator, the only available reading is that of an interrogative, either as a *wh*-element, or as an epistemic (yes/no questions). If there is no Q operator, then no interrogative reading arises and therefore *how* can only be compatible (*modulo* its degree feature) with a declarative context under selection by a certain class of predicates (hence its factive reading). The distinction between a specifier and a head (complementizer)

<sup>4</sup>One of the reviewers suggests that the degree feature of *how* combined with the pragmatics of verbs like *mind*, *like*, etc., as in "I would SO like to be there", is maintained in yes/no questions as well. Thus (5b) could get the answer "Well, not very much". Van Gelderen models this change in terms of interpretability (an interpretable feature becomes uninterpretable in its new position); I agree with the reviewer, however, that the degree feature remains interpretable. What is crucial is that yes/no questions introduced by *how* implicate an epistemic (or evidential) reading, which shifts the degree feature from the predicate to the proposition. Note also that in all the examples with matrix *how*, the modal *would* is present.


is a function of the syntactic position of *how* and the dependency it forms either with a constituent or a proposition. Note that the interpretation of *how* seems to be affected by the presence or the absence of an operator in the clause-structure, in a way that is reminiscent of polarity-item licensing. The pronoun then acquires its quantificational force, as a *wh*-phrase, through the presence of a Q operator. Once Q is absent, there is no *wh*-reading either, allowing for a declarative use as a complementizer. We come back to this issue in the following section.

In what follows, I will expand the empirical base by considering similar data in Greek which has a range of declarative complementizers with a pronominal (interrogative, relative) counterpart. As will be shown below, this "duality" can give rise to ambiguity in some contexts (recall *how* in 5a).

## **3 The double behavior of pronouns**

In the preceding discussion we saw that a clear-cut distinction between pronouns and complementizers is not so obvious. To put it differently, as the discussion in van Gelderen (2015) shows, the non-manner uses of *how* are also attested in earlier stages of English, so this is not an innovation. One way to understand this is as follows: the non-manner readings are compatible with a core interpretation of *how* that allows it to modify manner in qualitative terms as well (degree > emphasis). The interrogative use depends on the activation of the *wh*-feature in the scope of a Q operator; in fact, it only arises in the scope of Q. The issue of categorial reanalysis now emerges in clearer terms: does it really exist, and if so, to what extent? It is interesting to mention that in a framework where lexical items are considered feature bundles in the lexicon (Chomsky 1995), categorial classification can be viewed from a different perspective, as will be shown below.

Bearing the above in mind, let us now consider Greek which has a range of declarative complementizers. Along with *oti* ('that'), we also find *pos* ('how'). This looks very much like English *that* and *how*. There is a crucial difference though: *oti* and *pos* seem to be in free variation and are selected by the same predicates (note that some dialects may show a strong preference for *pos* instead of *oti*). Greek possesses a third declarative complementizer, namely *pu* (lit. 'where') which is selected by factive predicates (Christidis 1982; Roussou 1994; Varlokosta 1994). The complementizer *pu* also introduces restrictive and non-restrictive relative clauses, where *oti* and *pos* are excluded:<sup>5</sup>

<sup>5</sup>*How* can also be used in relative clauses, in examples like "The way how she walks". The equivalent construction in Greek would use the relative pronoun *opos*, which consists of the prefix *o-* and the *wh*-pronoun *pos* (lit. 'the how').


(7) Greek

a. nomiz-o oti/pos kerdhis-e to vravio
   think-1sg that/that won-3sg the prize
   'I think that she won the prize.'

b. thimame pu kerdhis-e to vravio
   remember-1sg that won-3sg the prize
   'I remember that she won the prize.'

c. o fititis pu sinandis-es ine filos mu
   the student that met-2sg is friend mine
   'The student that you met is my friend.'

Greek then has a two-way split among its three complementizers: *oti/pos* vs *pu*. The two-way distinction involves factivity and relativization. Specifically, *pu*-complements are associated with a factive interpretation, while *oti/pos*-complements are selected mainly by non-factives and only some factives (Christidis 1982; Roussou 1994). So there is a one-way implication between sentential complementation and factivity, since not all factive complements are introduced by *pu*. With respect to relativization, *pu* is the only complementizer available; as will be shown immediately below, free relatives behave differently (and exclude *pu*).

Considering *pos* and *pu* in more detail, we observe that they also correspond to *wh*-pronouns, as in the following examples:

(8) Greek

a. pos tha fij-is?
   how fut leave-2sg
   'How will you leave?'

b. pu tha pa-s?
   where fut go-2sg
   'Where will you go?'

c. pu to=edhos-es to vivlio?
   where it=gave-2sg the book
   'Where/to whom did you give the book?'

In (8) both *pos* and *pu* occur in matrix questions. They may also introduce embedded *wh*-interrogatives. Both sentences have a *wh*-question (rising) intonation. As (8c) shows, interrogative *pu*, apart from its locative reading, may also realize an indirect (oblique) *wh*-argument. Finally, it is crucial to mention that although *oti* does not have a *wh*-counterpart, it does have a relative pronoun counterpart, spelled orthographically as *o,ti* (lit. 'the what'). As a relative pronoun, it is found in free relatives with an inanimate referent, and is excluded from restrictive and non-restrictive relative clauses (the relevant examples are given below).

The picture we have so far with respect to the distribution of English and Greek complementizers and their pronominal counterparts can be summarized as in Table 5.1.


Table 5.1: Pronouns and complementizers (Greek and English)

A quick look at Table 5.1 shows that all five elements qualify as declarative complementizers, despite their different feature specifications. It further shows that all of them have a pronominal use as well, despite differences again. Based on this pattern, I will assume that their core defining property is that of N, i.e., they are essentially nominal elements (see also Franco 2012), which can be construed with different features (D, wh, etc.) or different functional layers (Baunaz 2015). In this respect, their core (minimal) categorial content is N – very much like indefinites; this property can account for the fact that they may also distribute like indefinites, subject to operator licensing (polarity-like). I leave this issue open for the time being.

Let us now consider the following sentence (I leave *oti* unglossed on purpose in the following example):

(9) Greek

pistev-i oti dhjavas-e
believe-3sg oti read-3sg

i. 'He believes that he has studied.'

ii. 'He believes whatever he has read.'


The two readings above exemplify two different construals. In (9i) *oti* is a complementizer that introduces the complement clause of *pistevi*. In (9ii) it is a relative pronoun construed as the argument (object) of *dhjavase*. The whole clause introduced by *oti* (or *o,ti*) is the internal argument (object) of *pistevi*. What is responsible for this ambiguity? First, a verb like *pistevi* 'believe' can take either a noun or a clause as its complement; second, the embedded verb *dhjavazi* 'read' can take an implicit argument. So in (9i), the matrix verb selects a sentential complement, and the embedded verb has an implicit argument. In (9ii), on the other hand, the matrix predicate selects a free relative (akin to a noun phrase), while the argument of the embedded verb is not implicit but present in the form of the displaced pronoun. The string of words *pistevi oti dhjavase* is ambiguous between a complement clause (where *oti* functions as a complementizer) and a relative clause (where *oti* functions as a free relative pronoun). The surface string of words in this case therefore corresponds to two different syntactic configurations.

Similar examples hold with the other two elements, namely *pos* and *pu*, as below (again left unglossed):

(10) Greek

paratiris-a pos jiriz-i o troxos
observed-1sg pos spin-3sg the wheel

i. 'I observed that the wheel was spinning.'
ii. 'I observed how the wheel was spinning.'

(11) Greek

emath-a pu perpatis-e
learnt-1sg pu walked-3sg

i. 'I learnt/found out that he had walked.'
ii. 'I learnt/found out where he had walked.'

In the absence of any PF-indication (prosody), each of these lexical items can be construed as a declarative complementizer (i), or a pronoun (ii).

In all examples (9–11) so far, the complementizer construal is possible to the extent that a declarative complement is selected by the matrix predicate. In (10–11), for example, if instead of *paratirisa* and *ematha* respectively we have an interrogative predicate, such as *rotisa* ('asked'), then only a *wh*-interrogative reading is available, as expected. So ambiguity arises in certain contexts only. The second property we need to point out is that the interrogative reading in these examples (and, accordingly, the free relative reading in (9)) depends on the availability of a variable in the complement clause, that is, an open position that modifies the predicate for manner, place, etc. (or an implicit argument in (9) for the free relative reading).


I will assume, in line with work in the recent literature (Manzini & Savoia 2007; 2011; Roussou 2010; Franco 2012), that as a complementizer each of these elements merges as the argument of the verb and takes the CP/IP as its complement. On the other hand, as a pronoun it is internal to the embedded clause, at least to the extent that it has a copy inside the clause (at the v/VP level), as illustrated below:

(12) a. V *oti*/*pos*/*pu* [CP/IP …]
     b. V [*o,ti*/*pos*/*pu* [CP/IP … *o,ti*/*pos*/*pu* ]]

The different configurations map onto different PFs; thus the ambiguity is resolved prosodically. As complementizers, *oti*, *pos*, and *pu* are unstressed, i.e., they do not form a prosodic unit. As pronouns, however, they are stressed, in a manner typical of *wh*-questions; i.e. the pronoun defines an L\*+H prosodic unit. This holds for all three cases, including the *o,ti* relative function. The pattern with *pu* as a complementizer has one more interesting angle: as expected, *pu* is unstressed, but the preceding predicate is stressed (an L\*+H prosodic unit). In other words, selection of a *pu*-complement in this context is associated not only with the semantics of the selecting predicate but also with focus. As expected, focus on the predicate turns the *pu*-complement into the presupposed part, hence its association with factivity (on the interaction of focus with factivity, see Kallulli 2006).

What we observe so far is that the lexical items under consideration have two phonological variants: a stressed one (pronominal) and an unstressed one (complementizer). This kind of alternation is quite common in the pronominal system. For example, in Classical Greek the indefinite pronoun *tis* has an accented variant (*tís*) as an interrogative (compare Latin *quis*); in (Modern) Greek, negative polarity items like *kanenas* ('anyone') and *tipota* ('anything') acquire the status of universal (negative) quantifiers when focused. So the different categorizations of *oti*, *pos*, and *pu* as pronouns vs. complementizers in relation to their phonological properties come as no surprise in this respect. But does this property suffice to classify them as distinct lexical items synchronically? The answer seems to be negative, given that their differences can be accounted for independently.

Assuming that the use of *pos* and *pu* as complementizers is an innovation in the path of their diachronic development, the question is whether grammaticalization is at stake or not. So far, I have argued that, strictly speaking, there is no categorial reanalysis as such, in the sense that in either function, these elements retain their nominal core. The activation of the *wh*-feature depends on the presence of a Q operator and involves focusing of the item in question. If this is correct, the interrogative reading is syntactically defined and is read off at the

### Anna Roussou

two interfaces (it introduces a variable at LF, it defines a prosodic unit at PF). On the other hand, as complementizers, they are selected by designated predicates and in turn select a proposition. The complementizer is externally merged, subject to selection by the matrix predicate. This structure is accordingly read off at LF, given that the complementizer turns the clause into an argument, and at PF, since it has no prosodic properties. This latter characteristic is in accordance with the notion of phonological reduction attested in grammaticalization. What about semantic weakening (or bleaching)? As a complementizer, the pronoun retains its nominal core and has no additional features (like *wh*-). At the same time, under their complementizer use, the lexical items under consideration expand, on the assumption that they manifest a wider choice in terms of selection, i.e., selection of an NP or a clause (CP/IP). Note that the complementizer status assigned to *oti* is not an innovation, since it has been used as a complementizer throughout the history of Greek.

The above properties can be summarized as follows:

	- a. Pronoun: prosodic unit, internal merge.
	- b. Complementizer: no prosodic unit, external merge.

The development of the complementizer use of *pos* and *pu* under the current approach is consistent with the "change" from internal to external merge. As already pointed out, as displaced pronouns due to internal merge, they bind a lower copy and take scope. As complementizers they merge externally and therefore do not bind a copy. Does this approach accommodate the idea of "upward reanalysis" of Roberts & Roussou (2003)? Recall that, in relation to the schema in (2), Roberts & Roussou assume that this involves a leftward shift of the clause boundary; this means that while the pronoun as a complementizer literally lowers, since it becomes part of the embedded clause, the boundaries of the embedded clause move upwards to include the reanalyzed pronoun. This account, though, has some shortcomings, given that it takes "upward" in linear and not structural terms. In terms of the claim made in the present paper, the upward reanalysis is accounted for structurally: the pronoun as a complementizer merges as the argument of the predicate (as it did before), and the paratactic clause becomes embedded under the pronoun, triggering the change from parataxis to hypotaxis. The pronoun nominalizes the second clause, which now qualifies as an argument. The relation between the pronoun and the clause changes from being anaphoric to being an instance of complementation.

In short, the presentation of the data above points towards a unified account of pronouns and complementizers. The basic line of reasoning is the following: if


two grammatical lexical items look the same, they are (most probably) the same. Further evidence is provided by the fact that this similarity is diachronically and synchronically supported. Diachronically, because we can trace the steps in the development of complementizers, and synchronically, because the similarity is too systematic across grammars, but also within a given grammar, to be simply treated as accidental (as it would be under homonymy).

## **4 Grammaticalization and syntactic categories**

The above discussion of (Greek) complementizers has concentrated on the connection between the "new" functional item and its lexical source. So the question has been whether complementizers retain their core nominal feature or not. So far, I have talked about complementizers and pronouns, assuming that the latter occur in the left periphery of the clause, while the former occur (potentially) as arguments of the selecting predicate. I have made no reference to the C head as such, though. In fact, the approaches that assign a nominal feature to complementizers distinguish them from the C position as such. If C is a position reserved for verbal elements, part of the (extended) projection of the verb (Manzini & Savoia 2011), then it is not and cannot be realized by nominal-type elements (such as pronouns or complementizers, unless the latter are verbal-like). This line of reasoning allows us to maintain that the pronoun-to-complementizer reanalysis is not an instance of categorial reanalysis. In other words, it is more of a functional change (affecting the use of the pronoun) and less so a formal one.

This issue of categorial reanalysis arises in all contexts of grammaticalization. For example, when verbs become modals, do they retain their verbal feature? Does *for*, as a complementizer in English, retain its prepositional feature? Is the infinitival marker *to* in English the same as the preposition *to*? This question can obviously be asked for every single case of grammaticalization, and it is related to the nature of syntactic categories, their repertoire, and their feature specification. The answers to the question just raised can vary. Consider the case of *for*, as in the following examples (for a historical account, see van Gelderen 2010):

	- b. I prefer for Mary/her to be late

In descriptive terms, *for* in (14a) is a preposition which takes a DP complement (*Mary*) or an accusative pronoun (*her*). In (14b), on the other hand, *for* introduces the infinitival complement, and is usually analyzed as a C element. However, it


can still assign accusative case to the embedded subject (*Mary*/*her*), at least under standard assumptions in generative grammar. If so, then it maintains its prepositional property of being a case assigner. This is further supported by the fact that, in Standard English at least, *for* forces the presence of an overt subject and excludes a null one (that is, a PRO subject), as in \**I prefer for to be late* vs. *I prefer to be late*. Based on similarities of this sort, Kayne (1984; 2000) pointed out the affinity between prepositions (P) and complementizers (C), but also determiners (D). This has since been a recurrent theme in the literature.

In the light of the present discussion, the link between apparently different categories is not surprising. To be more precise, if *for* is a preposition in (14a) there is no particular reason why it cannot be a preposition in (14b). The difference between the two instances has to do with the different complements *for* takes in either case. That prepositions can introduce subordinate clauses is rather well established, and can be further illustrated with the following examples:

	- a. After dinner, we went for a walk
	- b. After we had dinner, we went for a walk

Once again, the same element can take different types of complements: a DP or a clause (finite or non-finite). Unlike *for*, however, *after* can only select a finite clause, does not interfere with the realization of the embedded subject, and can only introduce adverbial (non-complement) clauses.

Having discussed the relation between pronominals and complementizers (potentially extending it to prepositions as well), let us consider in more detail the question first raised above, namely that of lexical splitting in the context of grammaticalization. Related to this is the categorial identification of the new lexical item. As discussed in the literature, there are many cases where the boundaries between two categories are not very clear, or are "fuzzy" (see Traugott & Trousdale 2010). In typological approaches to grammaticalization, where grammatical categories are under formation, the notion involved is that of *gradience*. However, in formal approaches, where grammatical categories are defined as bundles of features with a role in the syntactic computation, the notion of gradience is problematic. Roberts (2010) partly overcomes this problem by assuming an elaborate functional hierarchy, along the lines of Cinque (2006), allowing the same lexical item to merge on different heads along this hierarchy. This has the advantage of maintaining a core property, thus avoiding the issue of homonymy, while at the same time deriving the different meanings by merger of the same item in different positions. So in this respect, what looks like a lexical split has a syntactic explanation: the same lexical item can realize different positions along the functional hierarchy. One disadvantage of this


approach is that it requires every possible meaning to be syntactically encoded, leading to an immense increase in the number of functional categories, even in those cases where certain readings could be derived pragmatically.

Another issue that opens up under the current approach concerns the lexical vs. functional divide. If so-called grammaticalized elements can retain their verbal or nominal (in a broad sense) features, then the question is what implications this has for the lexicon, the syntax, and the view of parametric variation, among other things. One possible answer is that this basic distinction between two classes of lexical items is not primary but secondary. This is consistent with the view that the lexicon consists of lexical items with no a priori characterization (see Marantz 2001). If something is a predicate, i.e., it assigns a property or expresses a relation, it has all the typical characteristics required to qualify as a core lexical category. Languages allow such elements to be generated quite freely in the lexicon. At the same time, whether an element functions as a predicate or not is to a large extent determined configurationally under current minimalist assumptions. Similarly, whether the same element is "less lexical" or "more functional" is also determined configurationally. This is indeed captured in Cinque's (2006) approach, and is to some extent implicit in Roberts's (2010) account. So "more functional" in current terms is understood as being associated with a high position (and scope) in the clause structure. But if this is correct, it does not really tell us much about this distinction as an aspect of the lexicon. As Manzini & Savoia (2011: 5) put it, "There is no separate functional lexicon – and no separate way of accounting for its variation".

Consider again the case of verbs, which are typical examples of predicates in natural languages. The verb expresses its argument structure in connection with certain positions, realized by nominals (giving rise to expressions of transitivity, case, etc.). The typical I and C positions associated with the verb are essentially scope positions (relating to the event, the proposition, or various types of quantification over possible worlds, etc.). Nominals also have a predicative base, carry inflectional properties, and become arguments in relation to a predicate. What actually lies at the heart of this discussion is the categorization of concepts. Different choices give rise to different lexica cross-linguistically. Interestingly, it is in this respect that grammaticalization in functionalist frameworks makes sense, since the idea is that concepts acquire a grammatical form and, consequently, grammatical categories are defined functionally. In formal approaches, syntactic (grammatical) categories are meant to be well defined, and languages differ as to which concepts map onto which categories and how. This raises the question of how well defined categories actually are. Looking at complementizers and how they

develop out of pronouns can shed some light on this question. So is "complementizer" a formal category after all, or is it a functional classification of a nominal (or, in other languages, verbal) element? Grammaticalization phenomena then allow us to gain a better understanding of how syntactic categories can be defined, how they are realized cross-linguistically, and how they are manipulated by narrow syntax.

In short, grammaticalization phenomena can tell us something about the diachronic development of grammatical elements, especially with respect to morphosyntax. At the same time, they force us to pay closer attention to what a syntactic category actually is. The answers are not easy either way, but the empirical data are there to be further explored. Taking a view of grammaticalization along the lines suggested here, where its core property of categorial reanalysis is called into question, invites us to reconsider syntactic change and to focus more on how certain elements change the way they do, even if they retain their categorial status (thus involving no categorial reanalysis).

# **5 Concluding remarks**

In the present paper I have mainly focused on the notion of categorial reanalysis, and in this respect I have outlined an account which, at least to some extent, casts some doubt on the standard view. The empirical data were restricted to the development of complementizers out of pronouns. The basic argument has been that, formally, the innovative element, namely the complementizer, retains its nominal categorial feature. In its new function as a complementizer, the pronoun is externally merged as the argument of the selecting predicate. The change attested involves properties that affect the interfaces, such as phonological reduction and selectional/scope requirements.

# **Abbreviations**



## **Acknowledgements**

This paper is dedicated to Ian Roberts on the occasion of his 60th birthday. It reflects on our joint work on grammaticalization, adding a new angle on categorial reanalysis. Working with Ian has been inspirational and a good source of agreements and (productive) disagreements! The present version of the paper has benefited from the constructive comments of two anonymous reviewers. I thank them both.

## **References**




Kayne, Richard S. 1994. *The antisymmetry of syntax*. Cambridge, MA: MIT Press.


Lightfoot, David W. 1998. *The development of language: Acquisition, change, and evolution*. Oxford: Blackwell.




# **Chapter 6**

# **Little words – big consequences**

# Lisa Travis

McGill University

This paper investigates the interaction of E-language and I-language within the context of the macro- vs. micro-parameter debate. It presents a case study of variation found in the focus construction in Western Malayo-Polynesian languages: Tagalog and three dialects of Malagasy (Merina, Bezanozano, and Betsimisaraka). The grammatical role of the functional element that appears directly after the focused element, which is only subtly indicated in the E-language, turns out to be crucial, as its role can have significant repercussions in the I-language. More specifically, depending on whether this element is a determiner, a relativizer, or a complementizer, the construction itself can vary between a pseudo-cleft construction and a cleft construction. The hypothesis is made that the shift from the pseudo-cleft to the cleft construction opens the door to a possible reanalysis of these verb-initial languages as having SVO word order.

# **1 Introduction**

… study of the principles of syntax is not and cannot be a separate enterprise from study of the parameters. (Kayne 2005: 9)

It is hard to separate the study of syntax from the study of parameters. In the 80s and 90s, interest was in macro-parameters such as bounding (Rizzi 1982), pro-drop (e.g. Chomsky 1981), and word order. More recently, interest has turned to micro-parameters (see Kayne 2005). In a system that recognizes I(nternal)-language and E(xternal)-language, a tension is created between macro-parameters and micro-parameters. Macro-parameters are best suited to explain the speed of language acquisition: acquiring one small language detail will entail that many other language facts follow, because one parameter will account for a cluster of

Lisa Travis. 2020. Little words – big consequences. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 113–132. Berlin: Language Science Press. DOI: 10.5281/ zenodo.3972838


language-specific phenomena. If one were to design the perfect I-language system, a system of macro-parameters would appear to be the most efficient way to go. However, we know that language changes gradually, given that the E-languages of two successive generations in the chain of language change have to be mutually intelligible. So as far as E-language goes, a system of micro-parameters would appear to be the right way to go.

In this paper I argue that a small surface difference in the E-language might well indicate a large difference in the I-language. This would allow shifts in a macro-parameter that could well not interfere with mutual intelligibility. The particular change that I will be investigating is a hypothesized change from VOS to SVO in Austronesian. I will look at a focus construction in three dialects of Malagasy<sup>1</sup> – Merina, Bezanozano, and Betsimisaraka – and compare this to its Austronesian cousin, Tagalog. The claim will be that while Tagalog and Bezanozano, the most conservative Malagasy dialect of the three, can be argued to use pseudo-clefting for their focus construction, both Merina and Betsimisaraka appear to have moved to a cleft construction, which I argue makes them closer to becoming SVO languages. The important part of this proposal is that this shift rests entirely on the analysis of one functional category – a very small surface difference that points to a substantial underlying difference.

## **2 Clefts and pseudo-clefts**

In this section I give some background data on the relevant construction and introduce the issue of distinguishing between pseudo-clefts and clefts in predicate-initial languages that lack copulas and expletives. I will argue that it is the lack of transparency in these constructions that leads to reanalysis and language change. All of the languages/dialects under investigation are predicate-initial, but all have a focus construction in which a designated DP, which some analyses label the subject, appears sentence-initially. My argument will be that it is this construction that can eventually undergo reanalysis as a pure SVO structure. Whether or not it is susceptible to reanalysis will depend on how salient the signs are in this construction that the language remains predicate-initial. If the construction is clearly marked as a pseudo-cleft, its predicate-initial status will be clear. If the construction is a cleft construction, it will be subject to reanalysis. Why this is so will be explained in this section.

<sup>1</sup>Malagasy is the name of a variety of dialects spoken in Madagascar by about 18 million people.


### **2.1 Background data**

Tagalog, the most well-documented language spoken in the Philippines, is clearly verb-initial, with variable word order following the verb. As I will be comparing Tagalog to the Malagasy dialects, I give a brief overview of its focus construction here. In the Tagalog clause, there is a designated argument, which I will call the Pivot, that is marked by the particle *ang*.<sup>2</sup> In (1) below, we see that the sentence begins with the verb *bumili* 'buy' and that the Agent, acting as the Pivot, appears with the particle *ang*.

(1) Tagalog
	Bumili ng bigas **ang babae**
	at.buy acc rice nom woman
	'The woman bought rice.'

In order to create the focus construction, the *ang* DP is fronted, and the fronted DP is followed by another particle *ang*.<sup>3</sup>

(2) Tagalog
	**Ang babae** ang bumili ng bigas
	nom woman nom at.buy acc rice
	'It is the woman who bought rice.'

The Merina dialect of Malagasy<sup>4</sup> also has a Pivot DP, in this case indicated by its sentence-final position.

(3) Merina
	Manasa ny lambanay **Rakoto**
	prs.at.wash det clothes.1pl.excl Rakoto
	'Rakoto is washing our clothes.'

In a focus construction, this Pivot DP appears sentence-initially and is followed by the particle *no*.

<sup>2</sup>There are debates about the syntactic status of the *ang* DP, whether it is the subject, the topic, or the absolutive marked argument. In a parallel fashion, there is a debate about what the particle *ang* is – nominative case, default case, or absolutive case. What is important for the purpose of this paper is that it is a functional category that is part of the nominal extended projection.

<sup>3</sup> I will be using boxes to highlight the "little words" referred to in the title of this chapter at relevant points.

<sup>4</sup>Merina is the main dialect, very close to what is called Official Malagasy, and is spoken in the capital region.


(4) Merina
	**Rakoto** no manasa ny lambanay
	Rakoto *no* prs.at.wash det clothes.1pl.excl
	'It is Rakoto who is washing our clothes.'

The focus of this paper will be this construction and more specifically the role of the particle that follows the focussed DP. I will argue that this particle can be a nominal functional category (as we will see for Tagalog) or a verbal functional category (as we will see for the Merina dialect of Malagasy) and that the former indicates a pseudo-cleft construction while the latter indicates a cleft construction. We will see that in the pseudo-cleft construction, the clause remains firmly predicate-initial, while in the cleft construction, the word order within the clause is less obvious and therefore susceptible to reanalysis.

### **2.2 Discovering (pseudo)-clefts**

The first goal of the paper is to show that these constructions are clefts of some form. In order to do this, I follow arguments taken from the literature on Malagasy (e.g. Keenan 1976; Paul 2001; Pearson 2009; Potsdam 2006; Law 2007). The first task is to show that the sentence-initial DP is preceded by a (silent) verb. Using examples from Merina, we can see below that both negation (5) and the raising predicate *toa* 'seems' (6) can precede the DP. Since both negation and raising predicates select verbal projections and not DPs, the conclusion has been made that there is a covert copula preceding the focussed DPs.

(5) Merina
	**Tsy** Rakoto no manasa ny lambanay
	neg Rakoto *no* prs.at.wash det clothes.1pl.excl
	'It isn't Rakoto who is washing our clothes.'

(6) Merina
	**Toa** Rakoto no manasa ny lambanay
	seems Rakoto *no* prs.at.wash det clothes.1pl.excl
	'It seems to be Rakoto who is washing our clothes.'

While remaining silent on the structure that follows the DP, as this is the topic of the paper, we know that the first part of the construction contains an unrealized verb.

(7) [ Neg/RaisingV [ ⟨cop⟩ ] DP … ]


Now we take a brief excursion to discuss the distinction between clefts and pseudo-clefts in predicate-initial languages, why the distinction is very subtle, and why it is important to the issue at hand. We start with an English cleft in which an object (8a) or a subject (8b) has been extracted. Eventually we will look only at subject extraction, so I have put that example in bold.

(8)	a. It is a small dog that the child saw.
	b. **It is a small dog that saw the child.**

Now we look at the pseudo-cleft in (9a). In order to create a structure that works well with subject extraction, which is crucial in our discussion of the change in word order from VOS to SVO, I change the construction slightly in (9b) by substituting *the thing* for *what*. I am assuming that this change does not make any relevant difference to the structure itself. Finally, we see this structure with subject extraction in (9c), as this is what we will be comparing with the Malagasy structure.

(9)	a. **What** the child saw is a small dog.
	b. **The thing** that the child saw is a small dog.
	c. **The thing that saw the child is a small dog.**

In this exercise we will compare only the subject clefts (8b) and pseudo-clefts (9c) since these are the two constructions resembling most closely the Tagalog/Malagasy structures that we will encounter. In these languages, extraction is for the most part restricted to the Pivot DP. In order to simplify the discussion, we will start by focusing our attention only on sentences where the Agent is the Pivot.

*Step 1*: Our first task in understanding what our expectations are for clefts and pseudo-clefts in Malagasy and Tagalog, both predicate-initial languages, is to determine what we expect the order of elements to be. In order to do that, we first separate predicate from subject in clefts (10a) and pseudo-clefts (11a) and then front the predicates in the English examples (10b) and (11b).<sup>5</sup>

(10) Cleft


<sup>5</sup> In examples (10–14), subjects are in bold-face, predicates are in italics. In examples (10–15), unpronounced material is set in angled brackets.

(11) Pseudo-cleft


*Step 2*: Because we know that these languages do not have overt copulas, we can take these out of our expected structures.


*Step 3*: Because we know that these languages do not have expletive subjects, we can take these out of the relevant expected structures (i.e. the cleft).


*Step 4*: Because we know that these languages have headless relatives, we can take the head of the relative out of the relevant structure (i.e. the pseudo-cleft).


When we put the remaining pieces of the cleft and the pseudo-cleft side by side, we can now see (a) how minimally different these are on the surface, yet (b) how dissimilar they are in their underlying structure. Both begin with a DP followed by some functional material, and it is within this functional material that we get the only clues as to whether we are dealing with a cleft (C) or a pseudo-cleft (PC) construction. The only distinguishing elements are, in English, the complementizer *that* for the cleft and the determiner *the* and the relativizer *that* for the pseudo-cleft. Yet structurally these two constructions are very different, with the cleft construction having the predicate *e a small dog that saw the child* and no pronounced subject, while the pseudo-cleft has the predicate *e a small dog* and the subject *the e that saw the child*.

(15)	C: [ ⟨is⟩ **a small dog** ] [ that **saw the child** ] [ ⟨it⟩ ]
	PC: [ ⟨is⟩ **a small dog** ] [ the ⟨thing⟩ that **saw the child** ]

Now the question is why this is so important. I will argue that this distinction is crucial in the shift from a VOS language to an SVO language. Notice that only in the pseudo-cleft do we get information on where the subject is, and this information confirms that the language is predicate-initial (subject-final). In the cleft


structure, since the (expletive) subject is not pronounced, we have no indication as to whether the structure is SVO or VOS. Note also that if, for some reason, the functional category is not realized, we are left with the remaining elements *the small dog saw the child* – in other words, a simple SVO sentence. The lack of information in the cleft construction and the fragility of these functional categories will become important later in the paper, when I speculate on how languages move from a VOS word order to an SVO word order.

Having derived some word order expectations from this exercise, we return to the languages/dialects under study. Since the functional words that follow the sentence-initial DP are crucial in determining whether the focus constructions are clefts or pseudo-clefts, they will become the target of the investigation. To avoid prejudging the question, I will for now just call these functional words particles. The question will be whether these particles are part of the nominal extended projection or the verbal extended projection. I will end up classifying them into three types deriving from the three functional elements we find in (15) – the nominal particles (such as *the*), the relativizing particles (such as *that*), and the complementizer particles (such as *that*). To make it even clearer how difficult this is, consider English and the demonstrative *that*, the relativizer *that*, and the complementizer *that*. Very slight differences in pronunciation (where the relativizer and the complementizer *that*, but not the demonstrative *that*, may have a reduced vowel) and position can indicate quite different structures.

## **3 Tagalog and the Malagasy dialects**

In this section I will be comparing the different particles that we find in the focus constructions in Tagalog and three Malagasy dialects – Merina (Official Malagasy), Bezanozano, and Betsimisaraka. By seeing how they behave in other parts of the grammar, I hope to determine whether they are part of the nominal extended projection, a relativizer, or a complementizer (a part of the verbal functional projection).

### **3.1 Tagalog**

Tagalog immediately makes it fairly clear which particle we find following the focussed DP. We do not have to look very far to see that the particle *ang* is used as a nominal marker.<sup>6</sup> Below I have repeated our basic Tagalog sentence from above,

<sup>6</sup> For more details on Tagalog see Aldridge (2013), Kroeger (1993), Richards (1998), and for Polynesian languages, Potsdam & Polinsky (2011).


as well as the focus construction. In the basic clause (16a) we see *ang* appearing as a nominal marker on the Pivot DP. In (16b), *ang* appears twice, once before the now focussed and fronted Pivot DP, and once following this DP acting as the focussing particle.

(16) Tagalog
	a. Bumili ng bigas **ang babae**
	   at.buy acc rice nom woman
	   'The woman bought rice.'
	b. **Ang babae** ang bumili ng bigas
	   nom woman nom at.buy acc rice
	   'It is the woman who bought rice.'
There have been a variety of analyses of *ang*, which co-vary with the analysis of the syntactic structure of Tagalog clauses. However, whether it is a nominative case marker, an absolutive case marker, a Topic marker, or a determiner, it is a functional head along the extended projection of the noun. As for its other uses in the grammar, we can see below that when it precedes a predicate that is missing its Pivot DP, it creates a DP which refers to the missing argument. In (17) below, we see the predicate *bumili ng bigas* 'buy rice' preceded by *ang*, and it means something like 'the one who bought rice' or 'the rice-buyer'.

(17) Tagalog
	Pagod **ang** *bumili ng bigas*
	tired nom at.buy acc rice
	'The one who bought rice is tired.'

The verb can appear in a different form (the Theme Topic form) changing the Pivot from the Agent to the Theme as in (18a). When this form of the predicate is preceded by *ang*, it now means something like 'the thing that was bought by the woman' or 'the woman's bought thing'.

(18) Tagalog



While one of the translations given above is a headless relative, we know that *ang* is not the relativizer itself. When we do have a relative clause, *ang* appears before the head of the relative, and the relativizer has a different form, either *ng* or *na*. This form, sometimes called a linker, is also used between a nominal head and an adjective (see 19c and 19d).

(19) Tagalog


A plausible analysis for the focus construction, then, is one where the material following the focus particle is some sort of nominal that I will translate as 'the x that ...' – the translation that I have given to the pseudo-cleft in (9c) above. I repeat our Tagalog focus construction below and now give it a pseudo-cleft translation.

(20) Tagalog

**Ang babae** ang *bumili ng bigas*
nom woman nom at.buy acc rice
'The one who bought rice is the woman.'

The predicate of the clause is an unpronounced copula followed by the DP *ang babae* 'the woman', and the subject of the clause is *ang bumili ng bigas* 'the one who bought rice'.

A construction that will become important in our determination of the nature of the focus particle is the focussed PP construction. Tagalog and all three Malagasy dialects that we are comparing allow PPs to be fronted and focussed. We see the Tagalog PP focus construction below. Note that when the focussed constituent is a PP, the focus particle *ang* is disallowed.


(21) Tagalog

Sa palengke (\*ang) bumili ng bigas ang babae
prep market (nom) at.buy acc rice nom woman
'It was at the market that the woman bought rice.'

In fact, the inability to have a nominal functional category in this position makes sense because it is not clear what this nominal phrase would refer to. There is no missing Pivot in the material following the focussed element. What is missing is a PP but this is not nominal. Looking at English clefts and pseudo-clefts, we can see that with clefted PPs, it is sufficient to just have the complementizer *that*. However, with pseudo-clefts, we need to have the relevant wh-word to give the PP meaning. Notice that with a relative clause in English, we cannot drop the wh-word the same way that we can with DP arguments.

	- b. *Where* I bought rice was at the market.
	- c. That is rice *which*/**that** the woman bought.
	- d. That is the woman *who*/**that** bought rice.
	- e. That is the market *where*/**\*that** the woman bought rice.

Likewise in Tagalog, a DP relative clause head that would originate within a PP in the embedded clause needs to be followed by a complementizer and a contentful wh-word (here *kung saan* 'if where'). It cannot simply be followed by the linker as was the case in the relative clause constructions given in (19).

(23) Tagalog

Malayo ang palengke-\*ng / kung saan bumili ng bigas
far nom market-lnk if where at.buy acc rice
'The market where the woman bought rice is far.'

I would argue, then, that in Tagalog, when the Pivot is focussed, we have a pseudo-cleft construction signalled by the nominal functional category *ang*. When the PP is focussed, however, we have a cleft construction. What is important for the purpose of this paper, however, is that there is no mistaking a focus construction as having an SVO word order. If a DP Pivot appears sentence-initially, it is clearly followed by a DP signalled by the presence of *ang*.

### **3.2 Merina (Official Malagasy)**

Now we turn to Merina, the most documented dialect of Malagasy. Since I will be comparing it to other dialects of Malagasy, I will identify it as Merina. We see below that the focus particle is *no*. This particle is much more difficult to categorize.

(24) Merina

**Rakoto** *no* manasa ny lambanay
Rakoto no prs.at.wash det clothes.1pl.excl
'It is Rakoto who is washing our clothes.'

Unlike *ang* in Tagalog, the particle *no* in Merina is not used as a nominal functional category. We can see below that while the determiner, *ny*, is very similar in form, *no* cannot be used in its place.

(25) Merina
Mangatsika **ny/\*no** tranoko
cold det house.1sg.gen
'My house is cold.'

Given this, it is not surprising that *no* can be used with focussed PPs.

(26) Merina
Amin'ny penina **no** manorotra aho
with.gen.det pen no prs.at.write 1sg.nom
'It's with a pen that I am writing.'

The fact that it can be used with a focussed PP correlates with what we have seen in Tagalog. I argued that *ang* couldn't appear with a focussed PP precisely because it was a nominal functional category. Since we have seen that Merina *no* is not nominal, we would expect no clash with the PP.

Having seen that *no* is not nominal, we can now see that it is also not a relativizer. The relativizer in Merina is *izay*.

(27) Merina

Vizaka ny lehilahy (**izay**)/\*no manasa ny lambanay
tired det man rel prs.at.wash det clothes.1pl.excl
'The man who is washing our clothes is tired.'


The question arises, however, where else the particle *no* can appear. Interestingly, it is used to link two clauses together with a variety of effects (see Pearson 2009 for details). Below we have two clauses that are temporally connected and it is the particle *no* that creates the link.

(28) Merina

Natory Rakoto **no** naneno ny telefaonina
pst.at.sleep Rakoto no pst.at.ring det telephone
'Rakoto was sleeping when the phone rang.'

While *no* is not used as a complementizer (the most commonly used complementizer is *fa*), examples such as (28) above suggest that it is a particle that is part of the verbal extended projection. This makes it very different from *ang* in Tagalog, suggesting that the focus construction has a distinct underlying analysis. More specifically, I will argue that while DPs in Tagalog are focussed through a pseudo-cleft construction, they are focussed in Merina through a cleft construction.

### **3.3 Bezanozano**

Now we turn to Bezanozano, a more conservative dialect of Malagasy (see Ralalaoherivony et al. 2015 and Ranaivoson 2015 for more on Bezanozano). Not surprisingly, perhaps, it patterns more like Tagalog, which represents a more conservative form of Western Malayo-Polynesian sentence structure and morphology. Bezanozano has an interesting twist, however, that indicates a stage somewhere between Tagalog and Merina. We start with a basic sentence in Bezanozano that is not very different from what we have seen for Merina. The main difference is that the determiner, rather than being *ny*, is *i*.

(29) Bezanozano

Manasa **i** lambanay Rakoto
prs.at.wash det clothes.1pl.excl Rakoto
'Rakoto is washing our clothes.'

Turning now to the focus construction, we see that instead of the particle *no*, we find *i*.

(30) Bezanozano

Rakoto **i** manasa **i** lambanay
Rakoto det prs.at.wash det clothes.1pl.excl
'It is Rakoto who is washing our clothes.'

The similarity with Tagalog is now clear. The focussing particle is the same as the nominal functional category, most likely a determiner. What confirms this identity is the fact that the determiner *i* and the particle *i* show the same allomorphic variation, sometimes appearing as *ni* and sometimes as *i*. Given that both Bezanozano and Tagalog use nominal functional categories in the focus constructions, we would expect the distribution of these particles to work the same way in both languages. This is where the twist comes in. In Tagalog, we saw that focussed PPs could not be followed by the nominal *ang*. We can see below, however, that focussed PPs in Bezanozano can optionally be followed by the nominal *i*.

(31) Bezanozano

Amin'i penin-janako (**i**) manorotra aho
with.gen.det pen-child.1sg.gen (det) prs.at.write 1sg.nom
'It's with my child's pen that I am writing.'

Just as we were not surprised that in Tagalog *ang* could not follow PPs, we should be surprised that *i* can follow PPs in Bezanozano. One small consolation is that the *i* which follows the PP is not identical to the *i* that follows DPs: the former is optional while the latter is not. Preliminary work on this dialect has not provided any more information on this optional *i*, but given its distribution, I tentatively propose that the obligatory *i* is a nominal functional head and the optional *i* is a verbal functional head (though I have not yet found it in any other construction).

Important for the line of argumentation in this paper is that Bezanozano lies somewhere between Tagalog and Merina. Focussed DP constructions are pseudo-cleft constructions where the particle is actually a determiner, signalling that the construction is still subject-final. But with the appearance of a homophonous particle that is not nominal in nature following the PP, there is a possibility of reanalyzing this particle as necessarily not being nominal (since it can follow a PP), allowing for a reanalysis of the DP-initial structures as clefts rather than pseudo-clefts. This would lead to a status such as that of Merina.

### **3.4 Betsimisaraka**

While Bezanozano is more conservative than Merina, I will argue that Betsimisaraka is more innovative. My work on this dialect is quite preliminary, but I have elicited the following constructions. Starting again with the basic sentence, we can see that it is quite similar to the other two dialects.

(32) Betsimisaraka
Manasa lamba Rakoto
prs.at.wash clothes Rakoto
'Rakoto is washing clothes.'

Some differences start appearing, however, in the focus construction, precisely in the choice of the material that follows the focussed constituent. Below we first have a Merina example for comparison. This Merina construction shows that the same focus construction is used to form wh-questions. This example is followed by two examples from Betsimisaraka, one where a DP wh-word is in the focus position and one where a PP wh-word is in the focus position.

(33) a. Merina

Iza **no** manasa lamba
who no prs.at.wash clothes
'Who is washing clothes?'


This preliminary work on Betsimisaraka shows that either nothing or one of two different elements can be found in the position following the sentence-initial constituent. The two elements that may appear are very different from the particles we find in Merina and Bezanozano. Further, they have neither a nominal function along the lines of the particle in Bezanozano, nor a clausal function along the lines of the particle in Merina. It turns out that they are adverbs that carry parts of the meaning of a (pseudo-)cleft construction – where pseudo-clefts have a meaning of focus and of exhaustivity. The adverb *sy* in Betsimisaraka (*mihitsy* in Merina) means something like 'indeed' and the adverb *my* in Betsimisaraka (*ihany* in Merina) means 'only'. Technically, then, Betsimisaraka has no focus particle, but when pressed to place something in this position, speakers choose adverbs that lend the same flavour as a cleft. The position of these adverbs is not surprising, as adverbs are often found together with the particle *no* in Merina.

(34) Betsimisaraka

a. Tsy ny olona **mihitsy** **no** tokony hiaro an'Andriamanitra
neg det people indeed no should fut-at.protect acc-God
'It isn't in fact the people who should protect God.'
(from https://www.facebook.com/notes/ravonihanitra-lydia/sainampirenena-malagasy/10152939742301218/)

b. 15%n'ny Malagasy **ihany** **no** manana jiro
15%-gen Malagasy only no prs.at.have electricity
'It is only 15% of Malagasy that have electricity.'
(from http://www.sobikamada.com/index.php/vaovao/item/9918-jirama-15-n%E2%80%99ny-malagasy-ihany-no-manana-jiro.html)

Now we have a dialect that has no particle following the focussed phrase, basically resulting in SVO. Work needs to be done to determine in what situations this structure can be used, and with what restrictions. In other words, it remains to be determined what information a language learner will be exposed to that would indicate that this is not the basic word order of Betsimisaraka. But it is clear that the indications that this is **not** a basic word order become less and less accessible as we move from Tagalog to Bezanozano to Merina to Betsimisaraka and it all turns on the existence and function of the focussing particle.

## **4 Summary**

Moving then from Tagalog, to Bezanozano, to Merina, to Betsimisaraka, we see a slow chipping away at the information given to the language learner by the focus particle. I am assuming that in all of these languages/dialects, the focussed XP is within a predicate headed by an unpronounced copula. I gave the tests for this for Merina in (5) and (6). In these examples, negation and a raising verb respectively precede the focussed element, thereby indicating the presence of a verbal element.

Turning now to the particle that follows the focussed XP, we have seen that in Tagalog, the particle *ang* clearly marks the left edge of a nominal indicating that the material following the focussed element is a DP and the subject of the clause.

- a. **Ang babae** ang *bumili ng bigas*
  nom woman nom at.buy acc rice
  '[DP The ⟨one who⟩ *bought rice* ] [VP ⟨is⟩ **the woman** ]'
- b. [VP ⟨cop⟩ **DP** ] Predicate [DP *ang V O* ] Subject

The Tagalog focussed PP, in contrast, is found in a cleft construction. There is no *ang* to indicate a nominal phrase, therefore the material following the focussed constituent will not be interpreted as the subject of the clause. The subject of the clause, then, is an unpronounced expletive.

- a. **Sa palengke** bumili ng bigas ang babae
  prep market at.buy acc rice nom woman
  '[VP ⟨was⟩ **at the market** ⟨that⟩ the woman bought rice ] [DP ⟨It⟩ ]'
- b. [VP ⟨cop⟩ **PP** [CP V O S ] ] Predicate ⟨Expletive⟩ Subject

Bezanozano is similar to Tagalog in that it uses a clear nominal functional category for the DP focussed construction. This nominal functional category gives the language learner a clear indication that the language is VOS, since the predicate, which contains the unpronounced copula and the focussed DP, is followed by the nominal phrase indicated by the nominal functional category *i*.

- a. **Rakoto** **i** *manasa i lambanay*
  Rakoto det prs.at.wash det clothes.1pl.excl
  '[DP The ⟨one who⟩ *is washing our clothes* ] [VP ⟨is⟩ **Rakoto** ]'
- b. [VP ⟨cop⟩ **DP** ] Predicate [DP *i V O* ] Subject

The way Bezanozano differs from Tagalog, however, is that there is a particle that is used optionally within the PP focussed construction. For now I am going to assume that the fact that it is optional, while the one used in the DP focussed construction is not, indicates a structural difference of some type that allows this construction to be a cleft rather than a pseudo-cleft.

(38) Bezanozano: Focussed PP = Cleft (there is an optional *i*)


What is interesting is that this is the same particle that is used for the DP focussed construction. When it is not used, then, the language falls into the Tagalog pattern, where there is a particle in the DP focussed construction and no particle in the PP focussed construction. When it is used, it falls into the Merina pattern, which uses the same particle for both the DP and the PP focussed constructions. The thought is that these mixed messages allowed for a reanalysis that eventually leads to the Merina pattern.

Merina uses the same particle for both the DP and the PP focussed constructions, and this particle is used elsewhere to link clauses. This suggests that the particle is part of the verbal extended projection, and that both types of focus construction are clefts. Since the expletive subject of a cleft is not pronounced, these constructions provide fewer signals of the VOS order. In the DP focussed construction, since the surface order is S *no* VO, and since *no* is not a nominal marker, it could be susceptible to reanalysis.

(39) Merina: Focussed DP = Cleft (there is a clausal marker *no*)



In the last stage, we see that the identifying focus particle is dropped completely. Adverbs can appear in this position, but these adverbs can also appear in the Merina and Bezanozano construction. So now without any particle, a simple SVO order surfaces.

(41) Betsimisaraka
Rakoto manasa lamba
Rakoto prs.at.wash clothes
'It is Rakoto who is washing clothes.'

The task remains, however, to determine the status of this order in the language. We know that it can be given the cleft interpretation. We also know that it co-exists with the VOS word order. Whether or not the transition to SVO can be argued to be complete, it is at least imaginable how it can happen. It is also clear that the change turns on the reanalysis of small functional words that play central structural roles.

# **5 Conclusion**

The purpose of this paper was to show first that small surface differences in closely related languages can point to large underlying differences. It also shows how functional words are signposts to structure and that the multiple roles that they play both within the extended projection of one category and across different categorial projections can increase the flexibility of structures as well as increase the possibilities of reanalysis.

# **Abbreviations**



## **Acknowledgements**

This paper benefitted from funding from SSHRC grant 435-2016-1331 (PI: Lisa de-Mena Travis) and SSHRC grant 410-2011-0977 (PI: Ileana Paul), as well as crucial input from Baholisoa Simone Ralalaoherivony and Jeannot Fils Ranaivoson. Further, I thank Ian Roberts. It is hard to separate the study of parameters from his work over the years. Any serious study of parameters includes at least Biberauer & Roberts (2016), Roberts & Holmberg (2005), Roberts (2014), as well as his work in Roberts (1993). I also thank him for contributing to, as well as challenging, my understanding of too many areas of syntax to list – head-movement, diachrony, macro- and micro-parameters, clitics, V-movement, VP-movement, pro-drop, to mention a few. He has always been a model of research breadth and depth as well as research integrity.

# **References**




# **Chapter 7**

# **Heads and history**

Nigel Vincent The University of Manchester

# Kersti Börjars

St Catherine's College, University of Oxford

This paper considers and compares the status of the concept of head within different grammatical frameworks (Minimalism, LFG and HPSG) and its relevance to our understanding of the mechanisms of change involved in grammaticalization. Our data is drawn from the developments of lexical prepositions into grammatical prepositions and complementisers in Romance and Germanic. We argue in favour of a non-derivational approach and in particular against accounts in which all developments are mediated through a chain of functional heads of the kind deployed in cartography and nanosyntax.

Nigel Vincent & Kersti Börjars. 2020. Heads and history. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 133–157. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972840

# **1 Introduction**

Heads come in two kinds: lexical and functional. While the former are treated in a largely uniform way across theoretical frameworks, with the latter things are different. Functional heads have been reified as a core theoretical construct within Minimalism, where they abound particularly, but not exclusively, in the cartographic version, but have much less presence in a non-derivational framework like Lexical-Functional Grammar (LFG) and an even more reduced role in Head-Driven Phrase Structure Grammar (HPSG). The difference between the two kinds of heads also plays out in the diachronic domain. Nouns, verbs and adjectives often have consistent historical trajectories over centuries. Many of the nouns of modern English, for example, were also nouns a millennium ago in Old English even if they have undergone extensive phonological and semantic change in the meantime. The diachronic profiles of items that realise functional heads are very different, since, typically, they start out as full lexical words before developing into a grammatical item. English *will* is a good case in point, having begun life as a lexical verb meaning 'want' before becoming the temporal/modal marker that it is today and, in some approaches, being assigned a structural position under a node such as T or I. The key question then becomes: how do diachrony and synchrony interact, and in particular how is the historical relation between lexical and functional categories treated, in different grammatical frameworks? In the present paper, we seek to compare and contrast LFG, HPSG and Minimalism as models of (morpho)syntactic change. Our chosen dataset is the linked evolution of prepositions and complementisers in a range of Romance and Germanic languages, but we hope and believe that the conclusions we will draw on the basis of this evidence will extend both to other categories and to other languages and families.

# **2 Grammaticalisation and category change**

The phenomena that we will examine in this paper fall under the general heading of grammaticalisation, classically defined by Meillet (1912: 131) as "l'attribution du caractère grammatical à un mot jadis autonome [the attribution of a grammatical value to a formerly autonomous word]" and by Kuryłowicz (1965: 69) as "the increase of the range of a morpheme advancing from a lexical to a grammatical or from a less to a more grammatical status". We should be clear at the outset that such definitions seek to identify a phenomenon or a mechanism of change. Grammaticalisation is a descriptive label and not a theoretical construct, *pace* the locution "grammaticalisation theory" that is to be found from time to time in the literature, for instance in the positive reference by Haspelmath (1989: 318) to the "explanatory standards of grammaticalization theory". There are two properties which characterise such changes: the first is the fact that they recur within the histories of unrelated languages. In our introduction, for example, we cited the case of the English future auxiliary *will*, which derives from the Old English *willan* 'want'. A similar shift is to be seen in the use of the Romanian verb *a vrea*, etymologically the reflex of Latin *velle* 'want', to signal futurity, in similar uses of the 'want' verb elsewhere in the Balkans (Albanian, Croatian, Greek), in the Swahili future prefix -*ta*- originating in the verb *taka* 'want', and in parallel developments in a number of other languages (Heine & Kuteva 2002). The second property is the unidirectionality – or at least overwhelming asymmetry in direction – of such changes; thus, we find many instances of volition verbs becoming future tense markers, but none of futures turning into verbs of volition (see Börjars & Vincent 2011 for further discussion and exemplification).

To the claimed existence of grammaticalisation there have been two broad classes of response. One is to deny its place as a special and separately identifiable category among the general processes of reanalysis that characterise morphosyntactic change (see amongst others Campbell 2001; Joseph 2001; Newmeyer 2001). The alternative is to accept that grammaticalisation exists and to seek to model it in theoretical terms. This, in very different ways, is what has been done by Heine et al. (1991), Roberts & Roussou (2003), van Gelderen (2011) and Traugott & Trousdale (2013), and it is within this latter class of approaches that the present paper also falls. A central issue then becomes the nature of the theoretical constructs that are assumed. Roberts & Roussou (2003), for example, operate within a framework which permits synchronic analyses involving movement upwards from a lexical head to a functional head but not downwards from functional to lexical – a principle of Universal Grammar (UG) which appears to mimic, and has been argued to explain, the directionality of change from lexical to grammatical but not vice versa implicit in Meillet's and Kuryłowicz's definitions. LFG and HPSG, by contrast, do not include movement within their theoretical inventories.

## **3 Prepositions and complementisers in diachrony**

When it comes to categories and category change, prepositions are distinctive in two complementary, but as we will suggest connected, ways. From a synchronic point of view they appear to straddle the boundary between lexical items with their own semantic content – as in contrasting pairs such as *on* and *off*, *under* and *over*, *to* and *from* – and functional items such as the various ways of marking arguments of adjectives and verbs: *proud of, convince someone of*, *keen on, rely on*, *similar to, give to* or *different from, differ from*. (For more discussion in relation to a variety of languages, see the papers in Saint-Dizier 2006; Asbury et al. 2008; François et al. 2009; Cinque & Rizzi 2010.) At the same time there is also evidence that they all behave in ways akin to other functional items in acquisitional and pathological contexts. In this connection, the results of Froud's (2001) study of an aphasic patient are particularly striking and have led some to conclude that all prepositions should be treated as functional heads. A different but related contrast is that between open and closed classes. Many languages are like English in having a group of typically monosyllabic items that have high textual frequencies, plus a more open class of polysyllabic and syntactically complex items such as *across*, *behind*, *against*, *in front of*, *by virtue of* and the like which share the distribution of, and may alternate with, the monosyllabic items.


Diachronic considerations complicate the picture even further: polysyllables may shorten into monosyllables as a result of sound change (*over* > *o'er* in some dialects); simple and complex forms may contrast (*for* vs *against*, *behind* vs *in front of* ) and once independent forms may fuse or lose syntactic and semantic content (*because* < *by cause*, *beside* < *by side*, *in light of*, *by virtue of* ). In the historical context, prepositions are also remarkable because of the sheer variety of their etymological origins. Whereas temporal and aspectual markers are, for the most part, derived from independent verbs, prepositions can emerge from a variety of categorial sources. Thus, among the items that we will consider in more detail below, the Swedish and Danish prepositions *till* and *til* 'to, towards' are descended from a noun meaning 'goal' and are cognate with the German noun *Ziel* 'goal, target'. As such, in origin they were accompanied by nouns in the genitive, the case which typically marks nominal dependents. A trace of this can be seen in the final -*s* which survives in such fixed expressions as Danish *til sengs* 'to bed' and Swedish *till sjöss* 'at sea'. A similar effect is to be seen with the Latin items *causa* 'because of' and *gratia* 'thanks to', which have clear nominal origins and are the only Latin adpositions to govern the genitive case. And with prepositions too, we find recurrent patterns developing independently within different languages. For example, the items *hos* 'at, with' in Swedish and Danish and French *chez* 'at, with' are both descended from nouns meaning 'house, household' (Plank 2015); contrast the Swedish/Danish noun *hus*, which has remained a noun, and the fact that Latin *casa* 'hut' has stayed as the usual word for 'house, home' in Italian and Spanish.

In other instances, prepositions may stem from independent adverbial particles which acted as specifiers for particular case forms. This is particularly relevant for the items on which we focus below. Thus, Latin *ad* 'to, towards', and the infinitival markers in Swedish *att* and Danish *at*, all descend from a Proto-Indo-European particle \*ad 'at, near', hence the fact that the Latin preposition takes the accusative case, in origin used in a directional sense. By contrast Latin *de* comes from a particle meaning 'down, away from' and so occurs with the ablative, where the latter fuses earlier distinct locative and ablative cases (Vincent 1999; 2017).

In addition to nouns, particles and reduced complex structures of the *behind* type, prepositions may also derive from a range of non-finite verb forms, as with French *pendant* 'during' < Latin *pendentem* 'hanging', present participle of *pendeo*, English *including*, Italian *presso* 'near' < Latin *prehensus*, past participle of *prehendere* 'take', Danish *blandt* 'among' < *blandet*, past participle of *blande* 'mix', and Sicilian *agghiri* 'towards' < *ad jiri* 'to go.inf'. Similar in function to participles, and also possible etyma for prepositions, are adjectives, as in Italian *vicino* 'near' < Latin *vicinum*, or English *near*.

Complementisers exhibit a similar diversity of etymological sources, including demonstrative pronouns, as with English *that*, Swedish finite *att* and Estonian *et*; interrogative/relative pronouns, as with French *que* (< Latin *quid* 'what') and Greek *oti*; nouns, as for instance Korean *kεs* < 'thing' used with finite clauses; and verbs, especially verbs of saying, e.g. Yoruba *kpé*, Uzbek *deb* and Turkish *diye* (Kehayov & Boye 2016: 870–874). As we shall see in what follows, they may also evolve from prepositions linked to infinitives, as in the case of French *à* and corresponding patterns elsewhere in Romance, Swedish infinitival *att* and Danish *at*, English *to* and German *zu*, Irish *go* and Basque -*ela*; with the exception of *de* and its cognates, all of these are derived from allative prepositions. Within the literature such patterns have led some scholars to postulate an intermediate category of "prepositional complementiser" (Borsley 1986; 2001; Kayne 1999; and see §7 below). In this context, too, the directionality property is evident in that, while a preposition may over time acquire complementising functions, the reverse development is not attested.

## **4 Heads and diachrony across frameworks**

The evidence of diachrony has figured very differently within the frameworks under consideration here. The fact of language change and its implications for general linguistic theory have figured as core issues within the Chomskyan tradition ever since the seminal work of Lightfoot (1979). By contrast, there has to date been relatively little work from a diachronic perspective within LFG – but see the contributions to Butt & King (2001) for some examples and Börjars & Vincent (2017) for a general overview – and virtually nothing within HPSG. And yet in different ways both these last-mentioned approaches have much to offer historical linguists. In the first place, the absence of an assumption of an innate UG makes them easier to reconcile with the historical datasets derived from usage-based approaches without giving up on the commitment to formal modelling.<sup>1</sup> Secondly, their less rigid approach to phrase structure and their readiness to recognise other dimensions of linguistic information makes them more readily able to accommodate linguistic diversity, including that which is the result of change (Evans & Levinson 2009: 475).

<sup>1</sup> As one of our reviewers reminds us, there is no inherent incompatibility between a belief in the existence of an innate UG and the assumptions of LFG and HPSG. And there are also a range of views within the Minimalist community as to what exactly is to be ascribed to UG. However, the fact remains that, as far as we are aware, no variant of Minimalism abandons UG in its entirety, whereas within the HPSG and LFG communities there is general agreement that grammatical descriptions and explanations do not require the postulation of any innate components of language.

Let us begin then by comparing the types of category that are available within the different frameworks, with an eye particularly to the differences between the sub-types of non-lexical category, since it is at that point that they most obviously diverge from each other. In this respect, Minimalism is in principle the most straightforward, since it presupposes a simple contrast between lexical heads (at least N, V, A; Baker 2003: 303–325) and functional heads. Constituency trees are always binary and consist of a head (lexical or functional) plus its complement; lexical heads are always dominated by one or more functional projections and typically move from a lower base-generated position to a higher functional one in the course of a derivation. The system is thus apparently strictly constrained, but in fact the restrictions in one part of the tree lead to considerable analytical freedom elsewhere, since the inventory of functional heads is large and seemingly unconstrained, particularly in the cartographic variant of the approach. And while some such heads have names which at least suggest a semantic basis – T(ense), Mod(al), D(et), etc. – others seem to be there only to facilitate the necessary movements or to provide an intermediate location for arguments, while lacking any overt phonological exponence, as with so-called "small" vP and nP. Moreover, all heads can in principle be empty or be occupied by silent items, so the possible analytical space is in practice quite unconstrained.<sup>2</sup>

When it comes to LFG, the opposite state of affairs obtains. More types of basic category are available and there are no constraints barring non-binary or non-headed configurations. On the other hand, the inventory of functional heads deployed is generally assumed to be very limited, and null heads are avoided wherever possible. Table 7.1 sets out the categories recognised within this framework.

In the most constrained versions of LFG, a functional category is postulated only when a feature comes to be associated with a structural position within a particular language, but there is no expectation that such categories are of universal validity (Kroeger 1993: 6–7; Börjars et al. 1999). Much of the work that is done by such categories in a model like Minimalism – for example in the domains of tense and modality – is instead handled within the f-structure (where "f-" stands for functional in a different sense!), which is parallel to the c(onstituent) structure. The functional categories most commonly assumed are C, I and D, and

<sup>2</sup>A more constrained approach to categorial structure within a derivational framework is the Universal Spine explored in Wiltschko (2014). Lack of space forbids further consideration of this approach in the present context but for some discussion see Vincent (2018).

### 7 Heads and history


Table 7.1: Types of category in LFG

on such a view the natural diachronic trajectory is for a structure like DP to gradually emerge or "grow"; definiteness first becomes associated with a category D and in due course with a particular structural position, and hence as heading a DP where formerly there was an autonomous NP (Börjars et al. 2016). A different kind of construct within LFG is what, following Toivonen (2003), have come to be known as non-projecting words (notated Xˆ). Items in this class are of category X<sup>0</sup> but do not project to X′ or XP; they are marked as such in the lexicon and are head-adjoined to an associated and projecting X<sup>0</sup>. Toivonen's (2003) case study focuses on Swedish particles such as *ihjäl* 'to death' in the string *slå ihjäl* 'kill', lit. 'beat to death', where *slå* is of the category V<sup>0</sup>, as is the whole string, but where *ihjäl* is a non-projecting P. As she demonstrates, the items that fall within the class of particles belong to a number of different categories – verbal, nominal, adjectival and prepositional – but what they have in common is that they adjoin to another item, to which in effect they cede head status. What Toivonen does not observe, but which is striking once the diachronic perspective is adopted, is that most if not all of the items she categorises as non-projecting in this sense are themselves historically derived from full projecting categories or even phrases. The form *ihjäl*, for example, is a frozen version of the original PP *i hel* 'in the land of the dead'.<sup>3</sup>

When we come to HPSG, beside full lexical heads stands the category of transparent head (Flickinger 2008), that is to say an item which determines the overall category of the phrase it heads but does not add any semantic content (in the sense defined below) of its own. A case in point is the English complementiser *that*, which heads and defines a CP, but does not contribute to the semantic representation of the clause of which it is a part. Such a concept is close to if not identical with the status the same item would have in an LFG or Minimalist account. More radical, however, was the suggestion by Pollard & Sag (1994: 44–46) that such items belong to a separate category of "markers". In their account, a marker is "a word that is 'functional' or 'grammatical' as opposed to substantive, in the sense that its semantic content is purely logical in nature (perhaps even vacuous)". Crucially, a marker is not a head. This concept, which conforms in many respects to traditional intuitions about such items, is not, however, the preferred option. Rather, there has developed within recent HPSG work the notion of a "weak" head, defined by Abeillé et al. (2006: 156) as "a lexical head that shares its syntactic category and other head information with its complement". Table 7.2 below summarises the various notions of head within HPSG, and Table 7.3 compares the inventory of category types and their properties within LFG and HPSG.

<sup>3</sup>A reviewer points out that some recent work within Minimalism has adopted a similar notion of non-projecting words as a way of dealing with particles (see for example Biberauer 2017).

Table 7.2: Types of category in HPSG



With these concepts and categories in mind we can now ask what kinds of diachronic trajectories are predicted within the various systems and how these stack up against the empirical evidence.


## **5 Prepositions in the nominal domain**

We start with the example of Swedish *till* and compare the way it can be analysed within the three frameworks under consideration in this paper. As noted above, this item begins life as a noun, so the categorial shift in the first instance is N > P. However, as the examples in (1) demonstrate, in the modern language it has acquired a range of functions.

(1) a. Oscar tog tåget till Stockholm.
        Oscar take.pst train.def to Stockholm
        'Oscar took the train to Stockholm.'

    b. Oscar gav boken till läraren.
        Oscar give.pst book.def to teacher.def
        'Oscar gave the book to the teacher.'

    c. Oscar sparkade till däcket.
        Oscar kick.pst to tyre.def
        'Oscar gave the tyre a kick.'

In (1a), we have the directional sense consistent with its etymological source in a noun meaning 'goal', in (1b) it marks a grammatical relation, and in (1c) it behaves as an adverbial particle. Within LFG, these three uses can be modelled as in (2). Here (2a) simply states that *till* is a full preposition with its own semantic content expressed via the pred feature and that it subcategorises for an item having the function obj(ect). The representation in (2b), by contrast, indicates its use to mark the grammatical relation of an oblique recipient, and (2c) is an example of a non-projecting word serving as a marker of dynamic aspect (Toivonen 2003: 142).

(2) a. *till* P (*f* pred) = 'till <obj>'
    b. *till* P (*f* pcase) = obl*Recipient*
    c. *till* Pˆ (*f* aspect telic) = −
              (*f* aspect dynamic) = +
              (*f* aspect durative) = −

Neither of the developments in (2b) and (2c), which are logically independent of each other, is possible until after the use of *till* as a preposition with a full semantics has emerged, so the diachronic sequence is N > P > Pobl/Pˆ. In other words, on this view, once we reach the P stage the change is not reflected in the categorial head status of the item but in the kinds of f-structure that are associated with it and its projectability.
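The layered lexical entries in (2) lend themselves to a simple computational sketch. The following Python fragment is purely illustrative: the class name `Entry` and the dict-based encoding of f-descriptions are our own devices for exposition, not the API of any actual LFG implementation such as XLE.

```python
# Toy encoding of the three LFG entries for Swedish "till" in (2).
# Illustrative only; names and representations are our own.

from dataclasses import dataclass, field

@dataclass
class Entry:
    form: str
    category: str          # c-structure category: projecting "P" or non-projecting "P^"
    fdesc: dict = field(default_factory=dict)  # contribution to f-structure

LEXICON = [
    # (2a) full preposition: its own PRED, subcategorising for an OBJ
    Entry("till", "P", {"PRED": "till<OBJ>"}),
    # (2b) grammatical-relation marker: no PRED, only a PCASE value
    Entry("till", "P", {"PCASE": "OBL_Recipient"}),
    # (2c) non-projecting particle: aspectual features only
    Entry("till", "P^", {"ASPECT": {"TELIC": "-", "DYNAMIC": "+", "DURATIVE": "-"}}),
]

# All three diachronic layers share the same form; what differs is the
# f-description (and, for (2c), projectability), not the head category.
assert all(e.form == "till" for e in LEXICON)
assert {e.category for e in LEXICON} == {"P", "P^"}
```

The sketch makes concrete the point in the text: the three historical layers coexist as one form whose category remains prepositional, with the change located in the associated functional information.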

A complaint sometimes made about formal models by proponents of grammaticalisation theory is that they cannot capture what is described as the "gradualness" of change, because all they have at their disposal is a set of discrete categories (see for instance Haspelmath 1989: 330). The gradualness is more appropriately described as change in small steps, as argued by Roberts (2010). The analyses which we describe here do exactly that: they provide ways of capturing those stages between the prototypical categories that are characteristic of grammaticalisation, though, as we will see, the steps here are described in functional and/or feature terms rather than through the use of a larger inventory of syntactic heads in the way that is characteristic of cartographic and nanosyntactic approaches.<sup>4</sup>

Within HPSG, the full semantic use, or what Pollard & Sag (1994) call a "predicative preposition", is modelled as in (3).<sup>5</sup>


That is to say, it is a full independent head of the type *prep-word* with an NP complement, where the cont feature is defined in terms of the semantic concepts of figure and ground (Tseng 2000; 2002). The grammatical use is also of type *prep-word*, but in contrast to the allative preposition, it has no independent content value; the value for the whole phrase is instead derived from that of the NP complement (this use is referred to as "non-predicative" by Pollard & Sag 1994, and as "transparent" by Flickinger 2008, whereas Abeillé et al. 2006 describe it as a full head with "weak" semantics). This is illustrated in (4), where the values for the two cont features are shared.

<sup>4</sup> For further discussion of the gradualness question in the verbal domain, see Börjars & Vincent (2019).

<sup>5</sup>The authors we refer to here use slightly different versions of the HPSG formalism without this affecting the general principles of the solutions. Our aim here has been to illustrate the points made by the different authors in a unified way rather than to side with any one of them on detail.


$$\text{(4)} \quad \begin{bmatrix} \textit{prep-word} \\ \textsc{head} & \textit{prep} \\ \textsc{marking} & \textit{till} \\ \textsc{comps} & \left\langle \begin{bmatrix} \textsc{head} & \textit{noun} \\ \textsc{cont} & \boxed{1} \end{bmatrix} \right\rangle \\ \textsc{cont} & \boxed{1} \end{bmatrix}$$

In that sense, the preposition is semantically "transparent" but preserves its head status and the constituent is accordingly still a PP. As Table 7.3 illustrates, in HPSG, there is also a third analysis possible, namely that of a "weak head". This is the analysis proposed for the use of the preposition in French illustrated in (5) (Abeillé et al. 2006: 150), but it is not clear whether it would also be applicable to the Swedish example in (1c). The relevant feature matrix is provided in (6).

(5) a. *Des* bijoux ont été volés.
        de.def.pl jewel.pl have.prs.3pl be.pst.ptcp steal.pst.ptcp.pl
        'Jewels were stolen.'

    b. *De* sortir un peu plus te ferait du bien.
        de go out.inf a little more you do.cond.3sg de.def.m.sg good.sg
        'Getting out a bit more would do you good.'

$$\text{(6)} \quad \begin{bmatrix} \textit{weak-head} \\ \textsc{head} & \boxed{1} \\ \textsc{marking} & \textit{de} \\ \textsc{comps} & \left\langle \begin{bmatrix} \textsc{head} & \boxed{1}\ \textit{noun} \lor \textit{verb} \end{bmatrix} \right\rangle \end{bmatrix}$$

In (6), *de(s)* is no longer of type *prep-word*, but of a separate type *weak-head*. Characteristic of this type is that it shares the value for its head feature with its complement, which means that these features, such as inf on the VP complement in (5b), are visible for external selection. This in turn means that it transmits nominal properties if attached to a noun and verbal properties if attached to a verb. Such prepositions are dubbed "minor" by Van Eynde (2004) and "non-oblique" by Abeillé et al. (2006). This is also the analysis Tseng (2002) proposes for the complementiser *that* in English. The role of weak heads within the overall descriptive apparatus of HPSG is similar to that of non-projecting words in LFG in that they do not project, though as shown in Table 7.3, they differ with respect to semantic content. Both these systems are thus significantly different from Minimalism, where heads must always project. In diachronic terms, the development is then captured in HPSG from N to "full" P head and thence to either a transparent or a weak head or indeed, as here, to both.
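The head-sharing behaviour that distinguishes full heads from weak heads can likewise be given a toy computational rendering. Again this is purely illustrative: the function and feature names below are our own simplifications, not those of any HPSG grammar-engineering platform.

```python
# Toy contrast between a full head and an Abeillé et al.-style weak head.
# Illustrative only; the dict-based feature structures are our own device.

def combine(head, complement):
    """Build a phrase. A weak head copies its HEAD value (category and
    head features such as VFORM) from its complement, so those features
    remain visible for external selection; a full head keeps its own."""
    if head.get("type") == "weak-head":
        phrase_head = complement["HEAD"]   # category inherited from complement
    else:
        phrase_head = head["HEAD"]         # category preserved by the head
    return {"HEAD": phrase_head, "MARKING": head.get("MARKING", "unmarked")}

# Full preposition: the phrase is a PP whatever the complement contributes.
till = {"type": "prep-word", "HEAD": {"cat": "prep"}, "MARKING": "till"}
np = {"HEAD": {"cat": "noun"}}
assert combine(till, np)["HEAD"]["cat"] == "prep"

# Weak head "de": attached to an infinitival VP, the result is verbal and
# its VFORM value stays accessible to the selecting matrix verb (cf. (5b)/(6)).
de = {"type": "weak-head", "MARKING": "de"}
vp = {"HEAD": {"cat": "verb", "VFORM": "inf"}}
result = combine(de, vp)
assert result["HEAD"] == {"cat": "verb", "VFORM": "inf"}
assert result["MARKING"] == "de"
```

The design choice mirrors the prose: weakening a head is modelled not as a change of lexical category but as a change in where the phrase's head value comes from.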

The examples in (1) instantiate a well-known difficulty in synchronic descriptions of prepositions, namely how to model the formal identity beside the functional differences, and accounts such as those set out in (2), (3) and (4) achieve this goal by retaining the syntactic category P while associating it with different sets of morphosyntactic and semantic content. An alternative way to proceed is to postulate a separate category for the grammatical marker, in particular the functional head K, which licenses the associated NP or DP. K in turn can be realised either as a case-inflection or as a preposition. This solution has been strongly advocated in recent work within the nanosyntactic variant of Minimalism – see for example Svenonius (2008) and Roy & Svenonius (2009). Such an approach offers a way to capture the functional equivalence of *till* in an example like (1b) and the dative case in the equivalent sentence in a language like Latin, though the structural difference between the preposition and the case marker is not as straightforwardly captured. In the present context, it is to be noted that this case-marking function of prepositions is itself the outcome of historical change. Items like Swedish *till*, English *to* and French *à* start out as semantically full expressions of direction and acquire this secondary role over time. The same goes for prepositions like English *of* and French *de* in their role as marking the argument of a nominal head in expressions like *the king of England* or *le roi de France*. Within Minimalism such shifts can be seen as involving a change from P to K, whereas once again, in HPSG and LFG, the change is in the information associated with the argument of P rather than in the category itself.<sup>6</sup>

## **6 Prepositions in the verbal domain**

Prepositional items may also develop in the direction of taking verbal complements. In this section we examine three contrasting circumstances within Germanic and a further one in Romance. The Germanic developments are summarised in (7).

(7) b. German: *zu* derives from the same etymon as English *to* (< PIE \*do 'to', 'toward') and also has both prepositional and infinitival functions.

<sup>6</sup> For some discussion of the use of K in the analysis of complex prepositions like *in spite of*, Danish *på grund af* 'because', lit. 'on ground of' and French *à côté de* 'beside', lit. 'at side of', see Roy & Svenonius (2009) and Vincent (in press).


c. Swedish and Danish: the infinitive marker *att*/*at* also derives from a PIE locative particle \*ad 'to' but in this instance, unlike English and German, there is no homophony between infinitive marker and preposition, either because, as with Swedish *åt*, the preposition has an independent phonetic development or because, as in Danish, the prepositional usage does not survive.

All these developments are instances of the cross-linguistically recurrent diachronic cline (8) identified in Haspelmath (1989).

(8) allative preposition > purposive marker > infinitival marker

At the same time, there are significant structural differences between the individual Germanic languages under consideration here. German *zu* cannot be separated from the verb and hence the grammaticality difference between (9a) and (9b).

(9) a. Er hat versprochen, bald zu kommen.
        he have.pst promise.pst.ptcp soon zu come.inf
        'He had promised to come soon.'

    b. \* Er hat versprochen, zu bald kommen.

Indeed *zu* can, in certain circumstances, be part of the verb, as in the infinitive *aufzustehen* 'to stand up' beside the finite *ich stehe auf* 'I stand up'. In the words of Haspelmath (1989: 296): "Modern German *zu* is probably a bound prefix although the spelling treats it as a non-bound element" (compare Giusti 1991 for a similar conclusion).

In English, some separability is permitted, as in the Star Trek introduction: *To boldly go where no man has gone before* or in examples like (10), which are frequent despite the prescriptive prohibition of the split infinitive, not least because there is no obvious alternative to placing the adverb between *to* and *understand*.

(10) To really understand the situation you need to be an experienced politician.

The grammatical category to be assigned to English *to* is more controversial. Pullum (1982) argues that it behaves like an auxiliary, and Koster & May (1982) place it in I on the grounds that it expresses the feature value [−finite] and that finiteness in English is, in general, a property of items that fall under I. As Falk (2001) observes, this conclusion only follows if functional properties and categorial status have to be aligned, as indeed they do in the GB framework adopted by Koster & May, but Falk is operating within LFG and, having separated function and category, concludes that *to* is in C. We will not seek to resolve the matter here; it suffices for us to note that all are agreed that its status in this construction is no longer prepositional. Moreover, it is clear that the distribution of *to* in earlier stages of the language implies a different status from that which it has in the modern language (van Gelderen 1998). Haspelmath (1989) adduces similar evidence for the separation of *zu* from V in earlier stages of German. Putting this evidence together, therefore, we can postulate a diachronic trajectory from P to an intermediate functional head such as C or I followed by incorporation under V.

When we come to North Germanic, however, things look rather different. Not only is the etymological source of the infinitival marker different but so is its distribution (Platzack 1986; Beukema & den Dikken 1989; Christensen 2007). The examples in (11) show that Swedish *att*, for example, can be separated from the verb even by whole phrases and clauses.

(11) a. Hon njöt av *att* efter många år åter *känna* fast mark under fötterna.
         she enjoy.pst of att after many year again feel.inf solid ground under foot.pl.def
         'She enjoyed feeling solid ground under her feet again after many years.'

     b. *Att* fastän hon bara kunde ha stängt dörren efter sig *stanna* och *lyssna* på vad han hade att säga visade sig vara ett dåligt beslut.
         att although she only could ha.inf close.pst.ptcp door.def after refl stay.inf and listen.inf on what he have.pst att say.inf show.pst refl be.inf a poor decision
         'To stay and listen to what he had to say, even though she could have simply closed the door behind her, turned out to have been a poor decision.'

It is also the case that, in Swedish, negation and negated objects obligatorily occur between *att* and the verb as in (12).

(12) a. Hon gjorde sitt bästa för (\*inte) att inte somna (\*inte).
         she do.pst refl.poss best for att not fall asleep.inf
         'She did her best not to fall asleep.'


     b. Känslan av att ingenting kunna göra (\*ingenting) skrämmer mig.
         feeling.def of att nothing be able.inf do.inf frighten.prs me
         'The feeling of not being able to do anything about it frightens me.'

Given this distribution it is natural to see Swedish infinitival *att* and the corresponding forms in other Scandinavian languages as occupying the complementiser position and hence as instantiating a change from P to C. At the same time, it is of interest that these languages also display a separate form, usually spelled the same but pronounced differently, namely the complementiser for finite clauses, as in (13) (examples (13b) and (13c) taken from Nordström & Boye 2016).

(13) a. Swedish
        Olle vet att han får komma på festen.
        Olle know.prs comp he is allowed.prs come.inf on party.def
        'Olle knows that he is allowed to come to the party.'

     b. Danish
        Hun tvivler på at han er der.
        she doubt.prs on comp he be.prs there
        'She doubts that he is there.'

     c. Faroese
        Hon fortelur at hann fer at koma i dag.
        she tell.prs comp he go.prs at come.inf in day
        'She says that he is going to come today.'

Thus, in (13c) for example, the first occurrence of *at* is a finite complementiser derived from a demonstrative pronoun and cognate with English *that*, while the second occurrence in the future periphrasis *fer at koma* is cognate with Swedish infinitival *att* and has a prepositional source.

What we have seen in this section, then, is how prepositional items, which are traditionally defined as taking nominal complements, may also over time come to be associated with verbal complements. We turn now to consider the consequences of this alternative pattern of development.

## **7 From the nominal to the verbal domain**

We have characterised the changes in the previous section in terms of a historical shift from P to C and/or I, and this is indeed what would have to be said within both Minimalism and LFG. However, the HPSG concept of "weak head" allows us to generalise across all the developments by simply saying that the original full head status of the prepositions in question weakens over time. Recall that a weak head is defined as one that contributes only the value for the marking feature but yields its head value, that is, its syntactic category, to the item with which it combines. Thus, if it combines with a verb, as with German *zu*, its external distribution is determined by that verb; if it is an independent constituent, as is the claim made in assigning an item the status of I or C, then it will pattern with that larger constituent, be it finite or non-finite as the context requires. We will consider now some evidence from Romance where the items in question do indeed yield their distributional power to the item with which they co-occur but, unlike the Germanic examples we have been considering, they nonetheless retain their own value as prepositions. In other terminology, they are prepositional complementisers (Kayne 1999; Borsley 2001).

Compare the two French examples in (14) as discussed by Abeillé et al. (2006).

(14) a. Il est allé à la gare.
         he be.prs.3sg go.pst.ptcp to the station
         'He went to the station.'

     b. Il m'a invité à venir demain.
         he me-have.prs.3sg invite.pst.ptcp to come.inf tomorrow
         'He invited me to come tomorrow.'

(14a) is a clear case of the full lexical preposition *à* with the directional meaning 'to', akin therefore to Swedish *till* in (1a). (14b), on the other hand, is another instance of an allative preposition coming to introduce an infinitival complement of a higher verb. The difference in the Romance case is that the pattern with *à* (and its cognates in the other languages) exists and develops side by side with another such pattern using the preposition *de* 'of, from' as in the examples in (15).

(15) a. Il vient de Paris.
         he come.prs.3sg de Paris
         'He comes from Paris.'

     b. Il a décidé de venir demain.
         he have.prs.3sg decide.pst.ptcp de come.inf tomorrow
         'He has decided to come tomorrow.'


Abeillé et al. represent the lexical prepositions in (14a) and (15a) in much the same way as they would be represented in other frameworks: they are of the type *prep-word* and take an N-headed complement. The difference between frameworks is rather to be seen in the treatment of the grammaticalised use of the preposition to introduce an infinitive. For Abeillé et al., the weak heads *à* and *de* in (14b) and (15b) are heads in the sense that they select a complement, viz. the infinitival VP *venir demain*, and they add a value for the feature marking to the phrases they head, but they remain weak in the sense that they inherit the valence list of the complement. This last point is crucial since the matrix verb, on the one hand, determines the form of the complement – *inviter* in (14b) selects an infinitive marked with *à* and *décider* in (15b) one with *de* – and on the other contracts argument relations via control, or in other circumstances raising, with the embedded infinitive.<sup>7</sup>

At first sight it might appear that this is no different from saying that the items in question have become functional heads. However, Abeillé et al. (2006: note 12) are at pains to stress that, in their words, "weak heads differ from functional heads in LFG or GB". In particular, a weak head is not a new type of category. As they go on to say: "Although a weak head's category is underspecified in the lexicon, in any given syntactic context, it has a completely ordinary syntactic category (e.g. N or V). It is important to emphasise that when a weak head inherits a value of type verb or noun, it does not actually "become" a verb or a noun (i.e., a lexical object of type *noun-word* or *verb-word*)." Rather, in our present case, it maintains its status as a *prep-word*, which it shares with the full lexical preposition. In other words, the change is not a matter of grammatical category but of the manner in which elements of this kind integrate with the other parts of the sentence.<sup>8</sup>

Within LFG, a framework in which, as we have said, the distinction between category and function is built into the basic architecture via the distinction between f-structure and c-structure, an example like (14a) can be treated in the same way as our Swedish example (1a). For the infinitival construction, one option is to maintain the prepositional analysis, which entails a c-structure of the form in (16).

<sup>7</sup>Unlike either LFG or HPSG, or indeed some versions of Minimalism, Kayne (1999: 50) takes the alternative tack of arguing with respect to precisely this kind of Romance data that "prepositional complementisers do not form a constituent with the infinitival IP they are associated with". For a detailed response to Kayne's position, see Borsley (2001).

<sup>8</sup>There is one significant respect in which the infinitival markers differ from ordinary prepositions, namely that they do not combine with the preverbal clitics in the same way a preposition combines with the prenominal article. Thus, *à*/*de les voir* 'comp them see.inf' does not become \**aux*/*des voir* in the way that underlying *à*/*de les garçons* obligatorily becomes *aux*/*des garçons*. Standard accounts explain this by treating the clitic and the article as belonging to the category D and attributing the differential behaviour to a categorial distinction between a P and C/I, whereas Abeillé et al. follow traditional grammar and treat pronouns and articles as distinct categories with the phonological merger only applying to the sequence P + Art. However, as they observe in their footnote 9, decisive evidence one way or the other is hard to come by.

(16) [<sub>PP</sub> [<sub>P</sub> *à* ] [<sub>VP</sub> *venir demain* ] ]

This in turn would imply that diachronically the shift is not in the prepositional head but rather in an expansion of its f-structure to include xcomp as well as obl, so that there is a single lexical item with two alternate functional values depending on context. Alternatively, we have an IP with *à* defined as the value for the compform feature within its associated f-structure. The latter solution amounts to saying that there has been a diachronic shift at the categorial level, viz. P > C, and hence that there are two distinct items.

The empirical evidence here is split. Latin prepositions did not govern infinitives, but there was a construction in which *ad* took a gerund as complement, thus *ad dicendum* 'towards, for speaking'. The change seems to have involved the loss of the gerund (in this function at least) and its replacement by the infinitive, itself also a verbal noun in origin. While this argues for *ad* and its Romance reflexes having retained the status of prepositions, the fact that there are in the modern languages alternations between prepositional infinitives and finite complements introduced by *que* 'that' argues for the shift from P to C. Thus, if the complement of the preposition *avant* 'before' is infinitival, it is introduced by *de*, and if it is a finite clause we have *que*, as in (17).


(17) French

Whichever solution is in the end adopted, there is a further difference between the use of functional heads in LFG and Minimalism that needs to be emphasised. In the remark quoted above, Abeillé et al. refer to "LFG and GB". While it is true that in the latter, functional heads were for the most part restricted to C, T, I and D, at least one strand of Minimalism, the so-called cartographic approach developed
by Cinque and others, takes the further step of decomposing heads like C into a set of subsidiary functional heads (Rizzi 1997). Within such an approach, the original simple functional head C is split into a series of separate heads, of which Force is the highest and Fin the lowest.<sup>9</sup> The item *de* in an example like (15b) or (17a) would be assigned to the Fin head whereas a finite complementiser like *que* in (17b) is located in Force. There are, however, two problems with moves of this kind. First, there is the obvious danger that, as the number of such heads expands, explanation is replaced by enumeration. The set of functional heads simply becomes an ever more fine-grained taxonomy. To take a recent example, (18) sets out the structure proposed in Munaro & Poletto (2014) for items meaning 'where' (construed as a PP 'at/to wh-place') in a range of Italian dialects (= their (7)).

(18) [*PPDirSource* da/di [*PPDirGoal* in [*PPDirPath* d [*DisjP* o/u [*StatP* [*DegreeP* [*ModeDirP* [*AbsViewP* [*RelViewP* [*DeicticP/ExistP* là/v/nd [*AxPartP* [*PP* [*P⁰* ] [*NPplace/Restrictor* e [PLACE] ] ] ] ] ] ] ] ] ] ] ] ] ]

As they go on to note, "we assume that the whole extended projection in (7) is active even when a single lexically realized morpheme is present, irrespective of whether it occupies a high or low position" (2014: 292). When the constituent structures reach this order of complexity, it is reasonable to ask whether alternative approaches, in which not all aspects of meaning have to be driven through the syntax, are not worth considering. Moreover, diachrony adds a further difficulty: if, as we have seen and as also emerges in the Munaro & Poletto study and in related nanosyntactic research such as Roy & Svenonius (2009), the source of such heads lies in what were originally full lexical items, then the number of possible diachronic intermediate steps is potentially infinite, since there are no universally definable intermediate steps on the cline from lexical to grammatical.

## **8 Conclusions**

We are now in a position to draw some conclusions from the case studies we have been considering and in particular to consider the relevance of diachronic data for theory construction. Let us begin with the key point that this data set reinforces the standard conclusion that grammaticalisation has a clear directionality. Lexical items of various categories may become prepositions with a range of functions and they move on from there to become complementisers, thereby shifting from the domain of nominal marking to verbal marking. A natural question to ask therefore is whether such directionality follows from any independent properties of the frameworks we have been exploring. And in the case of both LFG and HPSG the answer is a clear no. There are no internal principles within their architectures which predict the direction of change. This is a notable difference when compared to Minimalism, where, as we noted at the outset, the fact that grammaticalisation changes show a directionality can be argued – and indeed has been argued, not least by Ian Roberts in a number of studies – to follow from the fact that Universal Grammar allows raising but not lowering as a derivational operation. However, even this principle would not account for our observation that prepositions become complementisers but not vice versa, since PP and CP are typically different projections rather than one being the extension of the other.

<sup>9</sup> In Rizzi's original account there were three intermediate heads between Force and Fin, namely two different Top(ic) heads ranged respectively above and below an intermediate Foc(us) head. In subsequent work within the framework, the number of such heads has expanded considerably but, for the purposes of our argument, consideration of Rizzi's original proposal is sufficient.

Two other types of diachronic pattern that have been considered from a Minimalist perspective are so-called lateral grammaticalisation and downward grammaticalisation. The classic instance of the former is the development of deictic markers into copular verbs (see Börjars & Vincent 2017 for discussion and references), where an item appears to jump across from the nominal to the verbal domain. Downward grammaticalisation, by contrast, is to be seen when an item starts its grammatical existence in a higher position and evolves into something which occupies a lower position in the tree. A case in point is the discussion by Munaro (2016) of the development of complementisers in some Italo-Romance dialects, where an item that was originally in the higher Force head position comes to occupy the lower Fin position. The evidence of changes such as these suggests that directionality of derivation is not the key to the directionality of change.

The alternatives, therefore, are either to find other internal mechanisms of grammar, such as the Late Merge and Economy principles proposed by van Gelderen (2009; 2011), or to consider the driving force of change to be the external circumstances of language use, but to deploy the devices of formal syntax in order to model such changes as and when they are attested. Thus, if, over time, we find evidence of nouns evolving into prepositions, prepositions evolving into complementisers and prepositions evolving from lexical ("full semantics") to grammatical ("weak" semantics), but we do not have any attested cases of the reverse, we may reasonably ask: why not? The answer, we suggest, lies in the fact that nonfinite forms start out as nominal and shift to verbal as they are incorporated into the verbal paradigm. There is, by contrast, no corresponding nominalisation of finite forms. In other words, the directionality follows from the content and contextual function of the constructions at issue and does not need to be ascribed to any principle of UG.

The constructions we have reviewed here also demonstrate that large scale categorial changes can – and given the diachronic evidence should – be broken down into smaller steps which in turn can be modelled using such formal constructs as weak and transparent heads and non-projecting words. Within frameworks like LFG and HPSG, however, such constructs are not required to respect universal principles of categorial hierarchy. And in particular within a parallel correspondence architecture such as that provided by LFG, changes in the different dimensions do not necessarily proceed at the same pace. This, of course, is a familiar result when it comes to (morpho)syntax and phonology, but even within the former dimension we can now see that an item may cease to co-occur with nominals without necessarily losing the marking properties of a preposition. What, on the other hand, all three systems discussed here share is a commitment to the formal modelling of linguistic structure. The relation between any formal account and a functional explanation for the existence or development of that account remains, by contrast, an open question.



## **Acknowledgements**

An earlier version of this paper was presented at the HeadLex16 conference in Warsaw in July 2016. Our thanks to those who commented on that occasion and to the anonymous reviewers of the present version.

Nigel Vincent & Kersti Börjars

## **References**


7 Heads and history




# **Chapter 8**

# **Micro- and nano-change in the verbal syntax of English**

# Eric Haeberli

University of Geneva

# Tabea Ihsane

University of Geneva

The verbal syntax of English undergoes substantial changes in the Late Middle and Early Modern English periods. The outcome of these changes is a clear division between main verbs and auxiliaries with respect to their syntactic behaviour. On the basis of quantitative data tracing the diachronic development of the distribution of verbal elements with respect to adverbs, this paper argues that the path towards the present-day system with a separate syntactic class of auxiliaries involved several small-scale steps that can be considered to be of the micro- and nano-type in Biberauer & Roberts's (2012; 2016) terminology.

# **1 Introduction**

As is well known, the verbal syntax of English undergoes important changes in the transition from Middle to Early Modern English. On the one hand, finite main verbs stop moving to the inflectional domain (decline of V-movement, cf. Roberts 1985; 1993; Kroch 1989; Pollock 1989 among many others), and, on the other hand, auxiliaries start forming a clearly distinct class of elements (recategorization of auxiliaries, cf. e.g. Lightfoot 1979; 2006; Warner 1993). In this paper, we will examine how these two developments interact, and we will show that what has generally been treated as major syntactic changes may have involved smaller steps with brief periods of variation at what, in Biberauer & Roberts's (2012; 2016) terms, could be called the micro- and nano-parametric level.

Eric Haeberli & Tabea Ihsane. 2020. Micro- and nano-change in the verbal syntax of English. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 159–173. Berlin: Language Science Press. DOI: 10.5281/zenodo.4041208


Our evidence comes from the distribution of finite verbal elements and adverbs. Besides negation, adverbs have been considered as the main diagnostic for V-movement out of the VP to the inflectional domain, the assumption being that certain adverbs and negation are merged above the VP and that the occurrence of the verb to the left of these is a sign of V-movement whereas the occurrence of the verb to the right signals the absence of such movement (cf. Emonds 1978; Pollock 1989 among many others). In the literature on the loss of V-movement in the history of English, discussions have generally focussed on negation and the rise of *do*-support. In Haeberli & Ihsane (2016), data involving adverbs are examined in detail, and it is shown that the two diagnostics for V-movement do not pattern alike. Whereas V-movement past adverbs declines relatively quickly between the middle of the 15th century and the middle of the 16th century, the loss of V-movement past negation starts only in the 16th century and takes well into the 18th century to be completed. On the basis of this contrast, Haeberli & Ihsane conclude that the loss of V-movement in the history of English is a two-step process (cf. also Han 2000; Han & Kroch 2000 for this claim based on different evidence). In the first phase, around 1500, V-movement to a high inflectional head is lost (T in Haeberli & Ihsane's analysis) whereas V-movement to a low inflectional head is maintained (Asp). This leads to a situation where V-movement past adverbs is lost while movement past negation still remains productive. Then, in the second phase, V-movement out of the VP is lost entirely and finite main verbs no longer occur to the left of negation.

Given that the first phase in the loss of V-movement starts in the 15th century, we would expect it to interact with the second major change affecting the verbal syntax in Early Modern English, i.e. the change in the syntactic status of auxiliaries. It is generally assumed in the literature that auxiliaries belong to the category V in early English, but that they are then reanalysed as belonging to a functional category in Early Modern English. For modals, this change has been situated approximately in the early 16th century (cf. e.g. Lightfoot 1979: 110; 2006: 31; Roberts 1993: 310f.). Since the decline of V-movement past adverbs already starts in the 15th century, we would expect that auxiliaries first participate in this change, but that they stop doing so with the categorial reanalysis in the early 16th century. In the following section, we will examine the diachronic development of adverb placement with respect to auxiliaries in order to determine whether such an interaction between the decline of V-movement and the recategorization of auxiliaries can indeed be observed.<sup>1</sup>

<sup>1</sup> An anonymous reviewer suggests that the interaction between the loss of verb movement and the recategorization of auxiliaries should also be tested on the basis of subject–verb inversion contexts as found in questions, where the verb must move out of the VP to reach a higher V2 position in the CP-domain. However, it is not clear whether subject–verb inversion data would provide us with useful evidence for the purposes of our investigation. First, the Mainland Scandinavian languages suggest that finite verbs can still move to C even after the loss of V-to-T movement. And secondly, although standard generative accounts assume that a verb has to move through the inflectional domain on its way to C, direct movement to C would be conceivable in more recent frameworks (cf. e.g. Roberts 2012 for an approach that allows movement from one phase head to another (i.e. v-to-C); or cf. also approaches viewing V2 as a phonetic form (PF) phenomenon). Given these observations, it seems that the adverb data considered below provide more solid evidence for our purposes than subject–verb inversion data. However, it would no doubt be worth exploring the consequences of the findings in Haeberli & Ihsane (2016) and in this paper with respect to how V-movement to C developed in the history of English, but we will have to leave this issue for future research.

# **2 Adverb placement with different types of verbal elements**

Old and Early Middle English had relatively frequent occurrences of adverbs between a subject and a finite main verb (SAdvV order) due to a certain variability in subject and verb placement. This system is simplified in the course of the Middle English period, and the subject and the finite verb are increasingly adjacent. In structural terms and under the assumption that adverbs are diagnostics for V-movement, this development can be considered as a trend towards a French-style grammar in which the verb moves past adverbs to T and the subject occurs in Spec,TP in non-interrogative clauses (cf. Haeberli & Ihsane 2016: 531ff. for discussion). In the middle of the 15th century, however, this trend is inverted and the frequency of the word order SAdvV increases again. In the data presented by Haeberli & Ihsane (2016: 512), the rate of SAdvV measured against the total number of clauses with an adverb to the right of the subject reaches its lowest point in the period 1420–1475 (8.5%). This rate increases to 16.5% in the period 1475–1500 and to 37.3% in the period 1500–1525, both changes being statistically significant. This quick rise of medial adverb placement, which is followed by a certain stability, can be considered as a symptom of the loss of V-movement past adverbs. The fact that SVAdv is not entirely lost is due to an alternative option to derive this word order that is independent of V-movement and that remains in use until today (right-adjunction of the adverb in the traditional account). There are contexts, however, in which a word order option depends entirely on the presence of V-movement, and these contexts provide support for the hypothesis that V-movement past adverbs is lost around 1500 (cf. Haeberli & Ihsane 2016: 514–520). Adverb placement with finite main verbs can then be taken as a baseline against which to examine the development of auxiliaries.<sup>2</sup> Assuming that auxiliaries have the same categorial status as main verbs in early English, we expect their distribution with respect to adverbs to develop in parallel until the two types of elements become categorially distinct.

### **2.1 Modals**

Haeberli & Ihsane (to appear) examine the development of the distribution of modals with respect to adverbs. One of their findings is that, throughout Old and Middle English, the frequency of the order SAdvM(odal)V measured against SMAdvV and SMVAdv is considerably lower than the frequency of SAdvV measured against SVAdv. Although this quantitative difference could be interpreted as suggesting that modals do not occur in the same structural position and thus do not have the same categorial status as main verbs already in early English, Haeberli & Ihsane show that such a conclusion is not necessarily correct and that other factors may play an important role in the quantitative contrast. If this is the case, the comparison should rather focus on the general diachronic trajectories, and in this respect the two contexts turn out to match up to 1500. This is shown in Table 8.1, which presents data for adverb placement with respect to finite modals from 1350 to 1650 (from Haeberli & Ihsane to appear) and compares them with main verbs (frequencies in the final column from Haeberli & Ihsane 2016: 512).<sup>3</sup>

<sup>2</sup> In this paper and in Haeberli & Ihsane (2016), we include data involving any type of adverb in our counts. A reviewer considers this as potentially problematic as different types of adverbs might occur in different positions in the clause structure. Two observations can be made here. First, given that one of the crucial word orders examined below (SAdvAuxV) occurs with very low frequencies, a further subdivision of the data according to adverb types would not allow us to obtain any meaningful results, possibly even if we extended our corpus substantially. However, even if the amount of available data were larger, it is not clear whether adverb type indeed interferes in a significant way with the change considered here. Data involving finite verbs and adverbs are more abundant, and with those no clear adverb type effect can be detected (cf. Haeberli & Ihsane 2016: 516–520, 524–525 for discussion).

<sup>3</sup> The data in the tables in this paper are based on the following three parsed corpora: *The Penn–Helsinki Parsed Corpus of Middle English 2* (PPCME2 (1150–1500); Kroch & Taylor 2000), *The Parsed Corpus of Early English Correspondence* (PCEEC (c. 1410–1695); Taylor et al. 2006), and *The Penn–Helsinki Parsed Corpus of Early Modern English* (PPCEME (1500–1700); Kroch et al. 2010). Overlaps between PCEEC and PPCEME have been removed. The data cover all main and subordinate clauses with an overt subject and a one-word AdvP of any type. In addition to the elements referred to in the word order patterns (S, A(dv), V, M(odal), *be*, *have*), further constituents such as objects, adjuncts or, in clauses with an auxiliary, a second non-finite element may occur in any position in these clauses. An anonymous reviewer points out that it might be problematic to collapse main clause and subordinate clause data as the two clause types may behave differently with respect to adverb placement. For the auxiliary data, this concern does not seem to be warranted. If we take all main and subordinate clauses containing an adverb and a finite auxiliary (excluding copula *be*) and we measure the rate of SAdvAuxV order in the two clause types separately, we can observe that between 1350 and 1650 the frequency of SAdvAuxV order indeed tends to be slightly higher in subordinate clauses but that this contrast is statistically significant in only one of the subperiods (1525–1550). To collapse main and subordinate clauses does therefore not seem to alter the general diachronic picture we obtain and it has the advantage of increasing the sample sizes. As for the baseline with finite main verbs, the clause type difference is somewhat more important (cf. Haeberli & Ihsane 2016: 524, fn. 52) in that SAdvV order is significantly more frequent in subordinate clauses in 5 of the 9 subperiods between 1350 and 1650. However, the general diachronic trajectory is similar, with SAdvV order sharply rising in the periods 1475–1500 and 1500–1525 in both clause types and with the frequencies then remaining, with some fluctuations, at the same level. For our purposes, it is this general diachronic picture that is essential. Distinguishing clause types would not alter our conclusions in any substantial way.

<sup>4</sup> For the Old English and Early Middle English data, cf. Haeberli & Ihsane (2016: 512; to appear).

Table 8.1: The distribution of finite modals and adverbs following an overt subject in Late Middle and Early Modern English (PPCME2, PCEEC, PPCEME)

The periods 1350–1420 and 1420–1475 show the end of a gradual decline in the frequencies of SAdvMV and SAdvV order from Old English onwards, with the low point being reached in 1420–1475.<sup>4</sup> In the following period 1475–1500, we see a significant increase of SAdvX both with modals (χ² = 11.00, p < 0.001) and with main verbs (χ² = 36.35, p < 0.001). But whereas this rise continues with main verbs in the period 1500–1525 and the frequencies then remain relatively stable, the rate of SAdvMV order drops in a statistically significant way to the level before 1475 (χ² = 3.94, p < 0.05). After that, there are two small increases with SAdvMV order but neither of them reaches statistical significance.<sup>5</sup> Finally, SAdvMV stabilizes at a slightly lower level.
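Period-to-period comparisons of this kind are standard 2×2 chi-squared tests over raw clause counts. As a rough illustration of how such a statistic is obtained (the counts below are invented placeholders for exposition, not the figures underlying Table 8.1, which come from the parsed corpora), the statistic can be computed directly:

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (without continuity correction) for the
    2x2 contingency table [[a, b], [c, d]], e.g. counts of SAdvMV vs. other
    orders in two adjacent periods."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts: 12 SAdvMV clauses out of 300 in one period vs.
# 40 out of 320 in the next (invented, NOT the paper's actual figures).
stat = chi2_2x2(12, 288, 40, 280)
print(round(stat, 2))
```

The resulting statistic is compared against the chi-squared distribution with one degree of freedom to obtain the p-values reported in the text.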

From a structural point of view, the developments in Table 8.1 can be interpreted as follows. As shown by Haeberli & Ihsane (2016), the increase in SAdvV order with main verbs around 1500 is best analysed as a symptom of the loss of V-movement. The same could then be said for the parallel development with modals in the period 1475–1500. At this point, modals still have the status of verbs and V-movement past adverbs therefore declines, leading to an increase in SAdvMV order. In the following period, however, modals start being reanalysed as elements merged (presumably relatively high) in the functional domain and the order SAdvMV therefore declines again. This analysis thus identifies exactly the same moment in time for the recategorization of the modals (i.e. the early 16th century) as earlier proposals made in the literature on the basis of entirely independent evidence (cf. e.g. Lightfoot 1979: 110; 2006: 31; Roberts 1993: 310f.).

### **2.2** *be*

Let us now consider the behaviour of other auxiliaries with respect to adverb placement. In early English, auxiliary *be* can co-occur with a main verb in the present participle form or the past participle form, and with the latter both in the active and the passive voice. Our corpus contains too few examples with present participles and the active voice to allow for meaningful separate quantitative analyses. Table 8.2 therefore combines the three contexts and thus covers clauses with finite *be* and any non-finite main verb.

As with modals, we see an initial decline in SAdv*be*V order. However, in contrast to the modals, the low point of 0.9% is reached only in the period 1475–1500 rather than in the period 1420–1475. But subsequently we see the same quantitative pattern as with modals: a rise to 3.7% followed by an immediate decline to 1.3%.<sup>6</sup>

The development of auxiliary *be* can now be compared to that of copula *be*. Table 8.3 presents data involving copula *be* followed by some non-verbal predicate.<sup>7</sup>

<sup>5</sup> Comparisons of the different periods give the following results: 1500–1525 vs. 1525–1550: χ² = 0.52, p < 0.5; 1525–1550 vs. 1550–1575: χ² = 3.75, p = 0.053; 1500–1525 vs. 1550–1575: χ² = 3.52, p = 0.061.

<sup>6</sup> These two developments do not quite reach statistical significance, however (rise in 1500–1525: two-tailed Fisher exact test, p = 0.057; decline in 1525–1550: two-tailed Fisher exact test, p = 0.073).

<sup>7</sup>Clauses with an elided predicate are not included. Furthermore, we also excluded clauses of the type *It so is that …* as some early texts use them repeatedly without variation in adverbial placement and the regular occurrences of these clauses would distort the general picture somewhat.


Table 8.2: The distribution of auxiliary *be* and adverbs following an overt subject in Late Middle and Early Modern English (PPCME2, PCEEC, PPCEME)

Table 8.3: The distribution of copula *be* and adverbs following an overt subject in Late Middle and Early Modern English (PPCME2, PCEEC, PPCEME)


Once again, we see an initial decline which, as in the case of auxiliary *be*, reaches its low point in the period 1475–1500 with 1.4% SAdv*be* order. Then, there is a statistically significant rise to 9.4% (two-tailed Fisher exact test, p = 0.007) and a subsequent decline that is gradual over several periods.
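With cell counts as low as these, the chi-squared approximation becomes unreliable, which is why comparisons in this range use Fisher's exact test. A minimal pure-Python sketch of the two-tailed version of the test (the example counts are invented for illustration, not taken from the tables):

```python
from math import comb

def fisher_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact test for the 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):  # probability of x counts in cell (1,1), margins fixed
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs + 1e-12)

# Hypothetical low-frequency counts, e.g. 2 SAdv-be(V) clauses out of 220
# in one period vs. 8 out of 215 in the next (invented figures):
p = fisher_two_tailed(2, 218, 8, 207)
```

Enumerating all tables with fixed margins is feasible here because the relevant counts are small; for larger samples the chi-squared test used elsewhere in the paper is the usual choice.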


### **2.3** *have*

Finally, consider adverb placement in clauses with the finite auxiliary *have* and a main verb in the past participle form. The relevant quantitative data are provided in Table 8.4.


Table 8.4: The distribution of auxiliary *have* and adverbs following an overt subject in Late Middle and Early Modern English (PPCME2, PCEEC, PPCEME)

The rate of SAdv*have*V is already very low in the initial period 1350–1420. It then remains low up to 1500 and rises in two steps to 5.3% and 7.4%. Whereas the first increase is not statistically significant, the difference between 1475–1500 and 1525–1550 is (χ² = 5.04, p = 0.024). After 1550, the rate of SAdv*have*V declines. The change is not statistically significant if we compare adjacent periods but the contrast between the periods 1525–1550 and 1575–1600 is clearly significant (χ² = 10.01, p = 0.002).

As with *be*, we may now compare the auxiliary data with those for the main verb uses. Table 8.5 shows the distribution of main verb *have* with respect to adverbs.

The frequency of SAdv*have* order declines until the end of the 15th century. It then rises in the following three periods and remains stable around 20% until 1650. Thus, up to 1550, auxiliary *have* and main verb *have* undergo similar developments.

### **2.4 Discussion**

Figure 8.1 summarizes the findings reported in Tables 8.1 to 8.5. The dates for the different data points correspond to the middle of each period distinguished in the tables (e.g. 1448 for the period 1420–1475).


Table 8.5: The distribution of main verb *have* and adverbs following an overt subject in Late Middle and Early Modern English (PPCME2, PCEEC, PPCEME)

Figure 8.1: Frequency of pre-verbal/pre-auxiliary/pre-copula placement of adverbs in Late Middle and Early Modern English
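The mapping from subperiods to plotted dates can be made explicit as follows; the two subperiods after 1600 are assumed here to continue the 25-year pattern of the periods cited in the running text:

```python
# The nine subperiods between 1350 and 1650 distinguished in the tables
# (the post-1600 boundaries are assumed, continuing the 25-year pattern).
periods = [(1350, 1420), (1420, 1475), (1475, 1500), (1500, 1525),
           (1525, 1550), (1550, 1575), (1575, 1600), (1600, 1625),
           (1625, 1650)]

# Each data point is plotted at the middle of its period, rounding
# half-years up, so that 1420-1475 is plotted at 1448.
midpoints = [(start + end + 1) // 2 for start, end in periods]
print(midpoints)
```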


One might wonder whether these low-frequency data, where potentially relevant differences occasionally lack statistical significance, allow us to draw any reliable conclusions. Although it is impossible to fully dispel such concerns without substantially extending our database, it is nevertheless extremely striking how regular the quantitative patterns in Figure 8.1 are. With each type of auxiliary and copula *be*, we can first detect a phase of decline in adverb placement to the left, then a very brief rise of this word order, and finally another decline. This pattern seems to be too regular to be entirely accidental.

Interestingly, this common pattern does not occur entirely in parallel across the different contexts. SAdvMV order (circled data points in Figure 8.1) rises together with SAdvV in the period 1475–1500. It immediately declines again in the period 1500–1525 while SAdvV keeps rising. As for *have* and *be* (rectangle and squares in Figure 8.1), their frequencies for adverb placement to the left remain low in the period 1475–1500. The rise occurs in the period 1500–1525 and is thus delayed by one period compared to modals and main verbs. Finally, the decline of SAdv*be*(V) order is also delayed by one period compared to modal verbs (1525–1550 rather than 1500–1525) and the decline with SAdv*have*V starts even later (squares corresponding to peaks in Figure 8.1). Thus, we have the sequence main verb/modals > *have*/*be* for the rise of SAdvX order and the sequence modal > *be* > auxiliary *have* for the decline of SAdvX.

These observations suggest that both the decline of V-movement and the recategorization of auxiliaries take place stepwise, with different lexical items being affected by the changes at different times. Let us consider V-movement first. In Minimalist terms, the increase of SAdvX order can be related to the loss of one or several unvalued formal features on V and of a V-feature on one or several corresponding functional heads, these features being required to establish the Agree relation that gives rise to V-movement (cf. Haeberli & Ihsane 2016: 528ff. for an account of main verbs). We will not go into the details of a feature-based analysis here and will simply refer to the unvalued feature(s) on V as F. In early English, all verbal elements are of the category V and they carry F as they all undergo movement. The initial rise in SAdvX order with main verbs and modals in Figure 8.1 suggests that a new variant of these elements emerges in the period 1475–1500 that lacks F and that leaves main verbs and modals in a lower position. At this point, the option without F is not available yet for *have* and *be* both in their main verb and auxiliary uses. This situation corresponds to what, following Biberauer & Roberts (2012; 2016), we could call nanoparametric variation. A change in the formal features of V affects almost all elements of this category with the exception of two specific lexical items.<sup>8</sup> This nanoparametric variation is very short-lived, however, and in the period 1500–1525 variants of *have* and *be* appear that lack F and this leads to an increase in the rate of SAdvX order.

At that point, modals are already a step ahead again. The frequency of SAdvM drops, suggesting, as discussed above, that they are reanalysed as being merged directly in the functional domain. If parameters are conceived of as changes in formal-feature specifications of heads and we include categorial features among the class of formal features, we could compare the reanalysis of modals to what Biberauer & Roberts (2012; 2016) call a microparametric change: A subclass of verbal elements (modals) is affected by a change with respect to a formal feature.<sup>9</sup> The class of items affected by recategorization is then gradually extended. First, in the period 1525–1550, SAdv*be*(V) order declines with *be*, suggesting that *be* is also reanalysed as being functional rather than of the category V.<sup>10</sup> Finally, auxiliary *have* can be argued to be recategorized in the period 1550–1575 when SAdv*have*V declines. *Have* in its use as a main verb, however, remains a member of the category V and, just as with other main verbs, the variant lacking F is strengthened, thereby giving rise to increasing occurrences of SAdv*have* order. These steps could be considered as being of the nanoparametric type as they involve individual items that are reanalysed (first *be*, then auxiliary *have*).

Before concluding, let us briefly consider why the changes described above may have proceeded the way they did. For the first contrast (delay in the decline of V-movement with *be*/*have*), we do not at present have a plausible explanation. As for the different steps with the decline of SAdvX order, however, the following scenario would be conceivable. In line with various proposals made in the literature, we can assume that, by the end of the Middle English period, recategorization of the modals becomes a natural consequence of developments affecting their status within the category of verbs. From a morphological point of view, modals become distinctive because, as the only surviving members of the preterite-present class of verbs, they lack 3sg agreement morphology and because their past forms become opaque from a semantic point of view as they no longer necessarily express past-time reference (Lightfoot 1979; 2006). Furthermore, as Roberts (1985: 42) points out, with the loss of the subjunctive/indicative distinction in Middle English, "the modals commonly appeared as 'semantic substitutes' for verbal inflection" and they "were being construed as clausal operators, like subjunctive inflection". Finally, as Roberts & Roussou (2003) argue, important morphological evidence for a biclausal structure with modals is lost once their complements no longer carry infinitival morphology. Given these developments, the reanalysis of the modals as functional elements in a monoclausal structure could be considered as a natural response to the "emptying" of the functional domain due to the decline of V-movement.

<sup>8</sup> Biberauer & Roberts (2012) suggest that a similar scenario holds for the very final phase in the loss of V-movement in English, when some specific verbs such as *know* or *doubt* preserve a feature on V triggering V-movement past negation longer than other verbs.

<sup>9</sup> Whether all modals change at the same time, or whether there is some earlier "leakage" into the functional domain with some specific modals, and therefore some nano-change (cf. Roberts & Roussou 2003: 43), cannot be determined on the basis of our data as the number of examples per modal per period is fairly small (but cf. Haeberli & Ihsane to appear for some data for *may*, *shall*, and *will*, which do not show any substantial difference in their diachronic development).

<sup>10</sup> It is likely that, after the reanalysis, *be* is not merged in the same position as the modals and that not all uses of *be* are merged in the same position. Furthermore, once auxiliaries have been recategorized, they may undergo movement within the functional domain. We have to leave a detailed investigation of these issues for further research.

The reanalysis of the modals can then be argued to have paved the way for analogical processes with the other verbal elements that are of a functional nature and do not assign thematic roles. The SAdvX data suggest that *be* is reanalysed first as being merged in the functional domain (1525–1550) and auxiliary *have* somewhat later (1550–1575). A possible explanation for the delay with *have* could be that main verb uses and auxiliary uses seem to influence each other. This is first observed in the period 1475–1500, where SAdvX with main verb *have* and SAdvX with auxiliary *have* continue declining together at a point when this word order already increases with other main verbs. Similarly, it could be argued that SAdvX with auxiliary *have* keeps increasing in the period 1525–1550 under the influence of main verb *have*, which, at this point, starts patterning more with other main verbs. It is only in the following period that auxiliary *have* aligns with other auxiliaries rather than with other uses of *have*.

## **3 Conclusion**

The verbal syntax of English undergoes substantial changes in the Late Middle and Early Modern English periods. The outcome of these changes is a clear division between main verbs and auxiliaries with respect to their syntactic behaviour. On the basis of data tracing the diachronic development of the distribution of verbal elements with respect to adverbs, we have argued in this paper that the path towards the present-day system may have involved several small-scale intermediate steps that can be considered to be of the micro- and nano-type in Biberauer & Roberts's (2012; 2016) terminology. First, in the phase of decline of V-movement past adverbs, two specific lexical items (*be* and *have*) undergo the change only after a short delay. Then, in the phase of the recategorization of auxiliaries as functional elements, modals are affected first, followed by auxiliary and copula *be*, and finally by auxiliary *have*. Each of these intermediate stages is very short-lived, confirming Biberauer & Roberts's (2016) suggestion that micro- and, in particular, nano-variation are highly prone to change. The clear auxiliary/main verb distinction that characterizes Present-Day English syntax can thus be argued to have emerged from a sequence of small-scale changes in a way that is reminiscent of lexical diffusion effects.

## **Abbreviations**


## **Acknowledgements**

It is with great pleasure that we dedicate this paper to Ian Roberts. For the first author, this paper is a token of his immense gratitude to Ian for having played an important role in sparking initial interest in diachronic syntax.

Earlier versions of this material were presented at the 8th Days of Swiss Linguistics (University of Zurich), the 16th Diachronic Generative Syntax conference (DiGS16, Hungarian Academy of Sciences, Budapest), and the 18th International Conference on English Historical Linguistics (ICEHL18, KU Leuven). We thank the audiences at these conferences as well as two anonymous reviewers and Theresa Biberauer for their comments and suggestions. This work was supported by the Swiss National Science Foundation under grant no. 143302.

## **References**






# **Chapter 9**

# **"Them's the men that does their work best": The Northern subject rule revisited**

Eric Fuß Ruhr University Bochum

# Carola Trips

University of Mannheim

This paper addresses a set of issues concerning the analysis and historical development of the so-called Northern subject rule (NSR), which characterizes many northern varieties of English. Based on an investigation of NSR effects in the Northern Middle English *York plays*, we present a new account of the NSR that combines a Distributed Morphology (DM) analysis of the relevant agreement markers with the idea that inflectional heads lacking phi-features ("blank generation", Roberts 2010) may acquire agreement features via the incorporation of adjacent subject pronouns. On the basis of this analysis, we suggest a new scenario for the historical development of the NSR, arguing that after the breakdown of the Old English agreement system, the NSR developed via dialect contact between northern and southern varieties. More precisely, we propose that syncopated verb forms (resulting from southern Agr-weakening) were integrated into the northern grammar as marked agreement formatives that contrasted with the generalized *-s*-ending.

# **1 Introduction**

This paper deals with both (i) the synchronic properties and (ii) the diachronic development of a peculiar agreement phenomenon that characterizes many northern dialects of (British) English. In varieties spoken in (central) northern England (in

Eric Fuß & Carola Trips. 2020. "Them's the men that does their work best": The Northern subject rule revisited. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 175–219. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972844


particular, Northumberland, Cumberland, Durham, and Westmorland), Scotland and northern Ireland (see Pietsch 2005a,b for details concerning the geographical distribution), the distribution of the verbal agreement formative *-s* is governed by what is today commonly called the *Northern subject rule* (NSR, Ihalainen 1994: 221; in earlier work, the same phenomenon has also been dubbed the "personal pronoun rule", McIntosh 1988, or "Northern present tense rule", Montgomery 1994).<sup>1</sup> Many northern English dialects have in common that the *s*-inflection, which is confined to 3sg present tense indicative in Standard English, has a wider distribution and may (variably) occur in other contexts as well (with plural subjects, in particular, but in certain varieties also with 1sg and 2sg; see Pietsch 2005a,b for NSR dialects with different inventories of inflections). Crucially, however, the realization of verbal agreement is subject to further conditions in the NSR dialects. The relevant varieties typically show the standard agreement pattern (3sg -*s*, zero ending elsewhere) in cases where the finite verb is directly adjacent to a pronominal subject, but whenever this configuration does not obtain, the generalized -*s* form occurs (cf. Murray 1873, Berndt 1956, McIntosh 1988, Montgomery 1994, Schendl 1996, Corrigan 1997, Börjars & Chapman 1998, Klemola 2000, Pietsch 2005a,b, de Haas 2011, amongst others). In other words, the realization of verbal agreement is sensitive to (i) the type of subject (pronouns vs. full DP subjects) and (ii) the position of the subject.

(1) *Northern subject rule* (NSR): A finite verb (in the present indicative) takes the ending *-s* except when it is directly adjacent to a non-3sg pronominal subject (*I/you.sg/we/you.pl/they*).

As a result, the NSR dialects exhibit a three-way distinction depending on the type and position of the subject: if the subject is a full DP, the finite verb takes the -*s* and adjacency is no determining factor (see 2a). If the subject is a non-3sg pronoun and adjacent to the finite verb, the finite verb does not take the -*s* ending (see 2b) and instead appears without overt inflection; if the subject pronoun is not adjacent to the verb, the -*s* occurs again. The adjacency effect is triggered by adverbs that intervene between the subject and the finite verb as shown in (2c) and in cases of VP coordination, as in (2d). A related effect can be observed in relative clauses such as (2e), where the relativizer intervenes between the pronominal head and the finite verb.

<sup>1</sup> See Godfrey & Tagliamonte (1999) for a similar pattern in Devon English spoken in the southwest of England.


(2)	a. the birds sing*s*
	b. they sing
	c. they only sing*s*
	d. they sing and dance*s*
	e. they that sing*s* ('they who sing')

The NSR also applies in cases where the pronoun is right-adjacent to the finite verb, i.e., in cases of subject-verb inversion:

(3)	a. *Do* they sing?
	b. *Does* the birds sing?
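Read procedurally, the descriptive rule in (1) amounts to a small decision procedure for affix selection. The following sketch is a hypothetical formalization of our own (the function name and feature encoding are not part of the original analyses); it implements the idealized pattern illustrated above, setting aside the special behaviour of 2sg *thou* discussed in footnote 2:

```python
def nsr_suffix(person, number, is_pronoun, is_adjacent):
    """Affix selection under the idealized Northern subject rule (1):
    the finite verb takes -s except when directly adjacent to a
    non-3sg subject pronoun. Hypothetical encoding for illustration."""
    non_3sg = not (person == 3 and number == "sg")
    if is_pronoun and is_adjacent and non_3sg:
        return ""   # zero ending: 'they sing'
    return "s"      # generalized -s: 'the birds sings', 'they only sings'

# The pattern in (2):
print("sing" + nsr_suffix(3, "pl", is_pronoun=False, is_adjacent=True))   # full DP: 'sings'
print("sing" + nsr_suffix(3, "pl", is_pronoun=True, is_adjacent=True))    # adjacent pronoun: 'sing'
print("sing" + nsr_suffix(3, "pl", is_pronoun=True, is_adjacent=False))   # non-adjacent pronoun: 'sings'
```

Since the rule makes no reference to linear direction, the same procedure covers the inversion cases in (3), where the pronoun is right-adjacent to the verb.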

The differences between the Standard English agreement system and the NSR dialects are schematically summarized in Table 9.1.<sup>2</sup>

Table 9.1: Verbal inflection (present tense), Standard English vs. Northern varieties + NSR


The kind of NSR as defined in (1) and illustrated in Table 9.1 has been reported for historical stages of Northern varieties of English (cf. e.g. Cowling 1915 on the dialect of Hackness in North-Yorkshire; Montgomery 1994 on Old Scots and northern ME/EModE), but does not seem to exist in this 'pure' form anymore today. Present-day varieties typically exhibit some amount of variation concerning the distribution of -*s* (cf. Montgomery 1994; Britain 2002; Pietsch 2005a,b; Adger & Smith 2010; Buchstaller et al. 2013; Childs 2013): With the exception of (i) 3sg

<sup>2</sup>As indicated in Table 9.1, in those dialects that have retained some reflex of the original 2sg pronoun *thou*, the 2sg pronouns typically behave on a par with 3sg forms in that they always trigger *s*-marking on the verb (Pietsch 2005b: 76). This observation will be addressed in more detail below.


subjects (which invariably trigger -*s*) and (ii) non-3sg pronouns adjacent to the verb (which strongly disfavour -*s*), the use of the -*s*-ending may vary with both nominal and pronominal subjects. To account for this kind of variation, it is often assumed that the constraints concerning type and position of subject are two separate (and competing) conditions (Montgomery 1994; Pietsch 2005a,b): Little or no variation obtains when there is no conflict between the constraints (i.e., with (i) 3sg subjects and (ii) non-3sg pronouns adjacent to the verb), while variable agreement patterns emerge in other contexts (e.g., with non-3sg pronouns that fail to be adjacent to the verb; more generally, non-adjacency of subject and verb seems to favour the use of -*s*, cf. Pietsch 2005b for details). Still, we think that it is important to understand the somewhat idealized system in Table 9.1, which can be taken to represent the historical basis from which the present-day dialects developed.

In the literature, a number of analyses have been put forward to explain the synchronic (and diachronic) facts (cf. Henry 1995 on Belfast English, Börjars & Chapman 1998, Hudson 1999, Pietsch 2005a, de Haas 2008; 2011, de Haas & van Kemenade 2015, Tortora & den Dikken 2010 on related phenomena in Appalachian English, Adger & Smith 2010 on the variety of Buckie in North-East Scotland). However, as pointed out by Pietsch (2005a: 180), most of these proposals focus on either the type of subject or position of subject constraint and therefore typically miss a subset of the relevant descriptive generalizations (cf. Pietsch 2005a and de Haas 2011 for extensive discussion).<sup>3</sup> This can be illustrated with the analysis proposed by Henry (1995) for so-called "singular concord" in Belfast English (basically the same account is adopted by de Haas 2008 to analyze NSR effects in the northern varieties more generally). Henry assumes that there is a link between morphological case marking and the subject's ability to trigger agreement on the verb. More precisely, she claims that only elements that are clearly marked as nominative (the pronouns *I*, *we*, *he*, *she*, *they*; *you* is treated as an exception) move to SpecAgrsP and trigger "standard" agreement on the verb (i.e., 3sg -*s* vs. zero in all other contexts). In contrast, full DP subjects occupy SpecTP, from which they cannot trigger verbal agreement, leading to insertion of the default ending -*s*, which is analyzed as a pure (present) tense marker:<sup>4</sup>

<sup>3</sup>Pietsch himself proposes a usage-based account of the data which captures the variable agreement facts in present-day NSR varieties in terms of competing lexicalized constructions but misses the morphological generalization that -*s* is the underspecified exponent in the relevant systems. See also Adger & Smith (2010: 1122f.) for critical discussion.

<sup>4</sup>Henry seems to assume that 3sg -*s* and default -*s* are separate markers, which happen to be homophonous. To account for variable -*s*-marking with phrasal subjects, she assumes that full DP subjects may optionally carry nominative (instead of default) Case, which licenses movement to SpecAgrsP.


(4)	a. [CP [AgrsP They [Agrs′ are [TP [T′ T [VP going]]]]]]
	b. [CP [AgrsP [Agrs′ [TP The teachers [T′ is [VP busy]]]]]]

This approach accounts for the type-of-subject condition, but it does not seem to have much to say about the adjacency condition that characterizes all other NSR varieties.<sup>5</sup> Moreover, Henry's account makes use of a number of non-standard assumptions and stipulations (e.g. concerning the optional presence of nominative Case on phrasal subjects), which does not seem to be particularly attractive on conceptual grounds. Recently, de Haas (2011) and de Haas & van Kemenade (2015) have put forward an update of Henry's analysis that includes a set of extra assumptions that take care of the adjacency condition. De Haas and de Haas & van Kemenade maintain the idea that only pronominal subjects occupy the specifier of a functional agreement head located above TP (de Haas 2011: SpecFP; de Haas & van Kemenade 2015: SpecAgrsP) whereas nominal subjects occur in a lower position (SpecTP) from where they cannot induce agreement. The adjacency effect is then captured by assuming that the (post-syntactic) realization of agreement on the finite verb (situated in T in ME, but presumably in an even lower position in the present-day varieties) is blocked by material that intervenes between AgrS/F and T and interrupts the transfer of agreement features from AgrS/F to T (which de Haas 2011: 166 analyzes as an instance of morphological merger, basically following Bobaljik 2002).<sup>6</sup> In all cases where the finite verb cannot acquire a set of valued agreement features, the resulting non-inflected verb is repaired by the (post-syntactic) insertion of the default inflection -*s*.

While this kind of mixed approach successfully describes the basic facts pertaining to the NSR, it still misses a couple of generalizations and raises certain issues from the perspective of more recent developments in the theory of syntax. First of all, it is based on the traditional assumption that subject-verb agreement is established in a spec-head relation and therefore does not translate easily into


<sup>5</sup>Note that the distribution of -*s* is also subject to an adjacency effect in Belfast English. However, the outcome of the adjacency condition seems to differ from what we have seen so far in that -*s*-marking is blocked when an adverb intervenes between a phrasal 3pl subject and a finite auxiliary (see Adger & Smith 2010: 1116ff. for discussion of the difference between Belfast English and other (Scottish/Northern English) NSR varieties).

<sup>6</sup>The authors further assume that this additional condition has been dropped in a number of varieties which exhibit the subject condition only (i.e., where pronominal subjects generally trigger a special form of agreement).


more recent models where agreement is taken to result from the operation Agree, that is, a configuration where a functional head with unvalued Agr-features c-commands the agreement controller (i.e., the subject in the case at hand). Second, an approach that maintains that there is a close connection between the NSR and multiple subject positions has to assume that there are still two different subject positions in the present-day NSR varieties. However, it is far from clear whether this consequence is supported by the facts. At least at first sight (abstracting away from the NSR), there does not seem to be a huge difference between Northern dialects and Standard English with regard to the structural position of pronominal and nominal subjects. In addition, the analysis raises the question of why adverbs intervening between the subject and the verb trigger an adjacency effect in Northern English but not in Standard English. To account for this empirical fact, de Haas (2011) assumes that adverbs have a completely different syntax in the NSR varieties: According to her analysis, adverbs occupy specifiers of separate functional projections in the Northern varieties (the heads of which block morphological merger of Agr and the finite verb in T) while they are merely adjuncts in Standard English. Again, this seems to be unwarranted. Moreover, as already pointed out by de Haas (2011) herself, the idea that default inflection is another repair strategy (in addition to *do*-support) that rescues an otherwise uninflected verb by attaching -*s* to it invites the question of why the relevant varieties do not resort to *do*-support instead (note that *do*-support is regularly used in other contexts such as negation in the present-day NSR varieties).

In the literature dealing with the historical development of the NSR, basically three different lines of thinking can be discerned (in addition to traditional accounts that typically invoke some form of analogical extension, cf. e.g. Sweet 1871 for the idea that the zero/vocalic plural ending was generalized from the present subjunctive to the present indicative; see Pietsch 2005a,b and de Haas 2011 for comprehensive overviews and critical discussion). First, it has been proposed that the NSR reflects an Old English (OE) pattern where 1pl and 2pl agreement endings are reduced to schwa in inversion contexts (OE agreement weakening, cf. Rodeffer 1903; see below for further details and discussion). Second, several authors have put forward the claim that the NSR results from language contact with Celtic/Brythonic (cf. e.g. Klemola 2000), where similar differences between pronouns and DP subjects can be observed (e.g., in Welsh). Finally, the rise of the NSR is sometimes attributed to dialect contact with southern varieties (cf. e.g. Pietsch 2005a,b). It seems fair to conclude, however, that no commonly accepted single explanation for the development of the NSR has hitherto been proposed. More recently, de Haas (2011) and de Haas & van Kemenade (2015)


(partially based on findings of Cole 2014) have put forward a multi-factorial approach to the rise of the NSR which incorporates aspects of both language-internal and language-external modes of explanation. They argue that the NSR developed when learners reanalyzed extensive variation in the plural endings of the present tense paradigm (-∅/*-e*, *-s*, *-th*, *-n*) as morphological marking of differential subject positions (i.e., a high position for pronouns linked to agreement, and a low position for other subjects giving rise to non-agreement/default inflection). According to the authors, this change was promoted by a conspiracy of factors, including agreement weakening in OE (-∅/*-e* instead of *-að* with 1pl, 2pl pronouns in inversion contexts), language contact with Brythonic Celtic (which presumably had an agreement system similar to present-day Welsh, which makes a systematic difference between pronominal and nominal subjects, see also Benskin 2011), language contact with Old Norse (which led to the erosion of the agreement morphology and presumably introduced the generalized *-s* marker), and the observation that pronominal subjects were particularly frequent in the context of (present) subjunctive forms of the verb, where the reduced ending -∅/*-e* had already become the norm (due to loss of final *-n*).<sup>7</sup> While the scenario envisaged by de Haas (2011) and de Haas & van Kemenade (2015) represents the most comprehensive explanation of the historical development of the NSR so far, some problems and open questions remain. In particular, the authors' decision to focus solely on the plural part of the paradigm (cf. de Haas 2011: 60) is somewhat unfortunate since it excludes the possibility that a given morphological change is sensitive to properties of the paradigm as a whole. This applies to all other (diachronic) studies, which usually ignore the first and second person singular.<sup>8</sup>

In this paper, we attempt to narrow the empirical gap concerning the first and second person singular by taking a look at the behavior of relevant forms in a late Northern ME text (the *York (Corpus Christi) plays*) that is also affected by the NSR. In addition, we will explore the synchronic and diachronic implications of an alternative theoretical approach to the NSR sketched in Roberts (2010). Roberts suggests a new analysis of the NSR which is based on his notion

<sup>7</sup>The connection between the subjunctive mood and pronominal subjects can be traced back to the fact that both tend to be used in embedded clauses, cf. de Haas (2011).

<sup>8</sup>An exception is Fernández-Cuesta's (2011) study of the NSR in first person singular contexts in Early Modern English. She shows that in 15th and 16th century wills from Yorkshire the adjacency constraint was still operative, especially in the period between 1450 and 1499. Further, Fernández-Cuesta cites evidence from the *Linguistic atlas of Early Middle English* (LAEME) which shows that the adjacency constraint was operative in Early ME (although, it must be said, the numbers are very small). Overall, she comes to the conclusion that the emergence of the *-s*/-*eth* ending in the first person singular context should be seen as an extension of the adjacency constraint of the NSR.


of "blank generation": He assumes that inflectional heads can enter the syntactic derivation without content/phi-features. The NSR is then attributed to the idea that subject pronouns incorporate into the relevant Agr-head, endowing it with features that trigger the marked (zero) agreement ending on the verb (while -*s* signals the absence of agreement features). As a result, the verb can only appear in its inflected form (marked by ∅) when it is string-adjacent to a weak/clitic subject pronoun.

The paper is structured as follows. In §2 we briefly highlight a set of morphological issues relating to the proper analysis of the NSR (and singular forms, in particular) that are at least in part only rarely discussed in theoretical approaches to the NSR. §3 deals with the historical development of the NSR and shows that, although in OE times there is unfortunately no direct textual evidence for the rule (but see Cole 2014 on possible early traces of the NSR in Northumbrian OE), there are some indications that OE agreement weakening in inversion patterns might have played a role in its development. Further, we will take a closer look at (late) Northern ME, focusing on the status of the NSR in the *York (Corpus Christi) plays*, which exhibit an intermediate version of the NSR with a set of special and interesting properties. §4 presents an analysis of the NSR based on Roberts (2010) in terms of "blank generation". §5 brings together our theoretical claims and diachronic observations and shows that our analysis can shed new light on both the inner mechanics of the NSR and its historical development. §6 provides a brief concluding summary.

## **2 Unfinished business: Morphology problems**

The general morphological problem concerning the differences between Standard English and the northern varieties is what Pietsch (2005b) refers to as the "markedness paradox": while *-s* appears to be the marked inflection in Standard English, the situation in the NSR dialects is more complex, since with full DPs and non-adjacent subjects the *-s* affix seems to function as a default marker, whereas with subject pronouns adjacent to the verb the *-s* ending seems to mark the feature combination [−speaker, −pl] (at least in the conservative NSR varieties that have retained the original 2sg pronoun *thou*, compare the somewhat idealized system in Table 9.1). The "markedness paradox" presents certain problems for morphological analysis which are rarely (if at all) addressed in the existing literature on the NSR. In particular, it appears that the widespread assumption that *-s* is an underspecified default marker (possibly signalling tense and/or mood, cf. e.g. Henry 1995; Pietsch 2005b; de Haas 2011; de Haas & van Kemenade 2015) does


not suffice to capture its distribution in the above paradigm: If the *s*-marker represents the elsewhere case, then the zero marker must be specified for a certain combination of values for the features [person] and [number]. However, assuming standard (binary) feature systems such as [±speaker], [±hearer]/[±author in speech event], [±participant in speech event] for [person] and [±plural] for [number],<sup>9</sup> it turns out that it does not seem to be possible to describe the distribution of the zero marker in terms of a specific set of feature values: As the zero marker occurs in the singular (1sg) as well as in the plural, and with all three persons, it does not signal any person or number distinctions (compare Table 9.1). So we seem to face an (impossible) situation where a paradigm is made up of two seemingly equally underspecified markers. Note that this dilemma cannot be resolved by treating *s*-marking with nominal and non-adjacent subjects separately (e.g. by assuming that verbs with nominal/non-adjacent subjects fail to acquire a set of agreement features in the syntax), at least as long as we want to maintain the idea that there is only a single *s*-affix in the NSR varieties. Such an approach merely restates the "markedness paradox": Again, it would seem that while *-s* is the unmarked/default marker with nominal/non-adjacent subjects, it appears to be more specified than the zero ending in cases where a pronominal subject is adjacent to the verb (cf. the second column in Table 9.1). Without additional assumptions, this state of affairs also seems to be incompatible with the proposal of de Haas (2011) and de Haas & van Kemenade (2015) that in the NSR dialects, the relevant inflectional markers are not linked to specific phi-feature values, but are used instead to realize a minimal binary distinction between "real" subject-verb agreement (signaled by ∅) and default inflection (via insertion of *-s*).

In what follows, we will outline a new approach to the distribution of markers in the "classic" NSR varieties (cf. Table 9.1) that maintains the basic insight that the relevant dialects have only a single *-s* affix with a uniform specification. More precisely, we agree with previous work that *-s* is a completely underspecified default marker, which represents the elsewhere case. We take it that the zero marker (*sing-*∅), on the other hand, signals the presence of positive values for person or number features.<sup>10</sup> The resulting (binary) inventory of agreement markers can be described as follows:

<sup>9</sup>And excluding further options such as accidental homophony, or the possibility of disjunctive feature specifications (e.g., [+plural OR 1sg]), which we consider to be less attractive theoretically. However, see Adger & Smith (2010) for an account of variable agreement marking in a present-day dialect based on the idea that a particular surface form may be linked to different feature specifications.

<sup>10</sup>Alternatively, we might assume that the *-s* ending marks the absence of positive specifications for person or number. While this analysis seems to be a technical possibility, it fails to capture the elsewhere/default character of *-s* in the relevant varieties (e.g., its use under non-adjacency etc.).


(5)	a. positive value for [person] or [number] ↔ -∅
	b. elsewhere ↔ /-z/

Thus, if the process of vocabulary insertion detects a positive phi-feature value for person or number (which is only possible in connection with adjacent subject pronouns, see §4.2 for a syntactic analysis), the verbal agreement morpheme will be realized by the zero affix, while in all other cases the default marker *-s* is inserted.

As concerns the presence of the *-s* affix with 3sg pronouns, we follow the common idea that 3sg forms are characterized by the absence of (positive) specifications for [person] and [number] (cf. e.g. Benveniste 1966, Halle 1997, Noyer 1997, Harley & Ritter 2002). As a result, the elsewhere marker *-s* is inserted in all 3sg contexts.

But note that this morphological analysis faces a similar problem to previous approaches in that it apparently fails to account for the use of the *s*-affix in the context of 2sg (note that (5) should lead us to expect that the zero marker is used in 2sg contexts in connection with *thou*). To solve this puzzle, we would like to propose that the relevant agreement morphemes are subject to the following impoverishment rule that operates on the output of the syntactic derivation and reduces the feature content of agreement morphemes (on T) under adjacency with subject pronouns prior to the insertion of vocabulary items (NOM = nominative):<sup>11</sup>

(6) [+hearer] → ∅ / \_\_ pronoun[NOM]

As a result of (6), the feature [+hearer] is deleted when the finite verb is adjacent to a subject pronoun (i.e., part of the same phonological phrase/word). This serves to block insertion of the zero marker in the context of 2sg due to the absence of positively valued feature values, leading to systematic syncretism of 2sg and 3sg. In all other contexts, a positively valued feature remains ([+speaker] with 1sg, [+pl] with all plural forms), which triggers insertion of the zero marker.
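The interaction of the vocabulary items in (5) with the impoverishment rule (6) can be sketched as a two-step procedure. The following is a hypothetical formalization of our own (the feature encoding and function name are not part of the analysis as stated); non-adjacent and nominal subjects are modelled by an empty feature set, reflecting the assumption that the agreement morpheme carries phi-features only in connection with an adjacent subject pronoun:

```python
# Positive person/number values that license the zero marker (cf. (5)).
POSITIVE_VALUES = {"+speaker", "+hearer", "+pl"}

def realize_agr(features, adjacent_nom_pronoun):
    """Step 1: impoverishment (6) deletes [+hearer] under adjacency
    with a nominative subject pronoun. Step 2: vocabulary insertion
    picks the zero marker if a positive person/number value survives,
    and the elsewhere marker -s otherwise. Hypothetical sketch."""
    feats = set(features)
    if adjacent_nom_pronoun:
        feats.discard("+hearer")            # rule (6)
    return "" if feats & POSITIVE_VALUES else "s"
```

This derives the systematic syncretism of 2sg and 3sg: adjacent *thou* contributes only [+hearer], which (6) removes, so the elsewhere marker *-s* is inserted just as in the featureless 3sg case, while 1sg [+speaker] and plural [+pl] survive and trigger the zero marker.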

This analysis not only accounts for the basic facts in the NSR dialects but also makes available a new perspective on 3sg *-s* in the present tense of Standard English. Similar to the NSR dialects, we might assume that this affix is not explicitly specified for [person] and [number]; rather, the distribution of *-s* and the zero form is sensitive to the presence/absence of positive feature values for [person]

<sup>11</sup>See Halle & Marantz (1993), Halle (1997), and Noyer (1997) on the workings of impoverishment rules, which typically lead to an extension of the contexts where underspecified markers can be used.


or [number] in the following way: The zero marker is inserted in all cases where a positive value for person or number can be detected (that is, in all contexts apart from 3sg); in the remaining context, *-s* is used (see Haeberli 2002, Roberts 2010 for a related analysis).

## **3 The historical development of the NSR**

### **3.1 Historical stages in the rise of the NSR**

In this section, we take a look at the historical development of the NSR. Before we deal with possible OE origins of the NSR in some more detail, we first outline its historical development from OE via ME to ModE (basically following Pietsch 2005a,b, de Haas 2011, and Cole 2014).

It is a well-known fact that during the transition from OE to ME nominal and verbal affixes became drastically reduced. The loss of inflections is particularly apparent in northern varieties. As shown by Berndt (1956) and Cole (2014), the erosion of the inflectional system first led to variation between several competing agreement markers, as evidenced in the *Lindisfarne gospels*, where 3sg and 1pl/2pl/3pl subjects may be cross-referenced on the verb variably by *-es*, *-as*, *-eð*, or *-að*. The default ending for 2sg is *-st* in OE; variants include *-est*, *-as*. In early Northern ME (NME), the OE 2sg *-est*, 3sg *-e/ðe* and plural forms *-a/ðe*/*-as* had already fallen together in the form *-e(s)*, which could be interpreted as an underspecified inflectional marker. Further, after the loss of vowels in the final syllable, Northern ME started to exhibit an opposition between 1sg -∅ and all other contexts (*-s*). At this point, new zero markers were introduced in the Northern ME varieties, eventually giving rise to the NSR. First, the zero marker was introduced in plural contexts where a finite lexical verb was adjacent to a subject pronoun, initially with 1pl/2pl and somewhat later with 3pl. In a further step, the *-s* affix was extended to 1sg pronouns (non-adjacent to the verb), presumably as a result of analogical pressure (Holmqvist 1922 assumes that the inherited null 1sg ending came to be perceived as being subject to the same mechanism that governed the alternation between *-s* and -∅ with plural forms). Finally, again probably via processes of analogy, the NSR was extended to forms of *be*, including *was/were*. <sup>12</sup> In some Northern dialects, 2sg *thou* was replaced with *you* (the original plural form) in the EModE period, which further broadened the scope

<sup>12</sup>Apparently, the use of *is* and *was* in the plural was never as categorical as the use of *-s* with lexical verbs (cf. e.g. Montgomery 1994). However, it seems that present-day dialects exhibit a different tendency, in that they preserve the NSR more strongly with forms of *be* (Pietsch 2005b: 12–13; but see Buchstaller et al. 2013 for different findings).

### Eric Fuß & Carola Trips

of the NSR.<sup>13</sup> Somewhat idealised, these stages of development are schematised and summarised in Table 9.2.

Table 9.2: Historical development of verbal inflection, Northern varieties


### **3.2 Old English**

Berndt (1956) makes the observation that a group of late Northumbrian texts, including the *Lindisfarne gospels*, the *Rushworth gloss*, and the *Durham ritual*, which are all dated to the mid-10th century, are the first OE texts showing the *-s* form variably alongside the *-ð* ending. Berndt assumes that the triggering factor for the occurrence of this form is subject pronouns, which could take over the function of person marking. What is implied in his comment is the special role subject pronouns play as opposed to full DP subjects, and his observations and assumptions hence foreshadow part of the NSR. Berndt's finding is corroborated by Cole (2014), the most comprehensive study of the earliest (Northumbrian OE) stages of the NSR so far. Cole provides an in-depth textual and linguistic analysis of the *Lindisfarne gospels*, focusing on the agreement system and early traces of the NSR in particular. Using statistical methods, she is able to identify a set of factors that govern the variation between the various agreement endings. One of her most intriguing results is the observation that adjacency between the finite verb and a (plural) subject pronoun (usually cases of inversion) clearly favours *-s* over *-ð*. For the 1/2pl subject pronouns *we* and *ge* she finds that they occur with an *-s* ending on the finite lexical verb 57% and 59% of the time, respectively (Cole 2014: 112). Two examples are given here (cf. Cole 2014: 93):

<sup>13</sup>Concerning the empirical gap in studies of the NSR, Pietsch (2005b: 46) notes that the *LALME* (McIntosh et al. 2013) "[...] does not give detailed accounts or statistics regarding [...] any information about the first and second persons in the documents studied. The only information given per document is whether -s forms were used regularly or rarely."

9 The Northern subject rule revisited

	- b. huu minum wordum *gelefes* *ge*.
	  how my words believe you.pl
	  'How will you believe my words?' (JnGl(Li) 5.47)

Thus, at first sight it seems that in late Northumbrian OE, there is already an early form of the NSR that differs from its later instalments in that the (innovative) *s*-ending plays the role later assumed by the zero/vocalic endings. However, this conclusion is misleading, since the relevant markers have a different status in their respective paradigms. While zero represents the marked inflection in the NSR varieties, *-s* is clearly the elsewhere case in the Northumbrian agreement system (cf. Table 9.2). At least from a morphological point of view, the Northumbrian facts are more similar to southern OE agreement weakening, in that a less distinctive agreement marker is used in connection with adjacent pronominal subjects.<sup>14</sup> Recall that (late) southern OE exhibits an agreement alternation that is sensitive to subject type and the position of the finite verb (Jespersen 1949: 15; Quirk & Wrenn 1955: 42; Campbell 1959: 296; van Gelderen 2000). In cases where the 1pl/2pl subject pronouns *we* or *ge* directly follow the inverted finite verb, the regular agreement endings (present tense indicative/subjunctive *-að*, *-on*, *-en*) are replaced by schwa:<sup>15</sup>

(8) a. Ne *sceole* *ge* swa softe sinc gegangen.
    neg must you so easily treasure obtain
    'You must not obtain treasure so easily.' (Battle of Maldon, p. 244, l. 59)

	- i. Jij loop-t dagelijks met een hondje over straat.
	  you walk-2sg daily with a doggy over street
	- ii. Dagelijks loop-∅ jij met een hondje over straat.
	  daily walk-∅ you with a doggy over street

<sup>14</sup>This can perhaps be analyzed as an instance of featural haplology (Nevins 2012), where the verb's phi-set is deleted in cases where the verb is adjacent to another pure phi-set, i.e., a subject pronoun.

<sup>15</sup>Similar observations hold for early OHG (1pl), cf. Braune & Reiffenstein (2004: 262), and present-day Dutch (Ackema & Neeleman 2004: 193):


b. Hwæt *secge* *we* be þæm coc?
   what say we about the cook
   'What do we say about the cook?' (Ælfric's Colloquy on the Occupations, p. 188, l. 68)

As noted above, Rodeffer (1903) explicitly assumes that these syncopated forms were the direct source of the later affixless forms in the NSR varieties. Although there is no direct equivalent of the NSR in OE, the finding that the reduced *-e* affix occurs in inversion contexts might have contributed to the development of the NSR (see §5 for further discussion).

In §1 we noted that the studies presented so far leave an empirical gap concerning the 1sg and 2sg forms. Since we are interested in the development of the full paradigm, we will include these two forms in the empirical study presented in the following section.

### **3.3 Middle English**

In a recent study of the NSR in ME, de Haas & van Kemenade (2015) investigate the agreement properties of full verbs, focussing on present tense indicative plural forms. The study is based on 36 texts dated between 1150 and 1350 taken from the LAEME corpus, as well as the sample of the *Northern prose rule of St. Benet* from the PPCME2 and a digitized version of a Lancaster romance. They identify 15 texts which display variation between *-*∅*/-e/-n* and *-s/-th* endings and show the strongest effects for the adjacency and type-of-subject conditions in their corpus. Further, they locate a core area of the NSR in Yorkshire and note that in texts from more peripheral areas the adjacency condition is often weaker or even absent. They interpret this finding as evidence for an analysis that is based on different subject positions, as mentioned above in §2. A brief glance at the sample of Richard Rolle's *Epistles* in the PPCME2 (Kroch & Taylor 2000)<sup>16</sup> confirms that both the adjacency and the type-of-subject condition seem to be quite well established:

(9) a. Some þe devell deceyves þurgh vayne glory, þat es ydil joy: when any has pryde and delyte in þamself of þe penance þat *þai* *suffer*, of gode dedes þat *þai* *do*, of any vertu þat *þai* *have*; es
    some the devil deceives through vain glory, that is idle joy: when any has pride and delight in themselves of the penance that they suffer, of good deeds that they do, of any virtue that they have; is

<sup>16</sup>Richard Rolle of Hampole (ca. 1290–1349), Yorkshire, English hermit and mystic, was one of the first religious writers to use the vernacular. He was very well known at his time, and his writings were widely read during the 14th and 15th century.


glad when *men* *loves* þam, sari when *men* *lackes* þam, *haves* *envy* to þam þat es spokyn mare gode of þan of þam; (ROLLEP,86.368)
glad when men loves them, sorry when men lacks them, haves envy to them that is spoken more good of than of them

b. He says þat "he lufes þam þat lufes hym, and *þai* *þat* *arely* *wakes* til hym sal fynde him". (ROLLEP,76.212)
   he says that "he loves them that loves him, and they that early wakes till him shall find him"

As the Yorkshire area seems to have played an important role in the historical development of the NSR, it might be worthwhile to take a closer look at texts from that region to complement de Haas & van Kemenade's (2015) findings on plural forms with relevant data from the singular part of the agreement paradigm (with a focus on 1sg and 2sg; recall that 3sg usually does not take part in the NSR). Under the assumption that first and second singular pronouns are likely to occur in dialogues, we decided to survey the *York plays*, a ME cycle of 47 mystery plays dated between the mid-fourteenth century and 1463–1477, when the manuscript (MS. Add. 35290, British Library, London) was copied.<sup>17</sup>

As has been repeatedly pointed out in the literature (cf. e.g. Smith 1885; Cawley 1952; Beadle 1982; Burrow & Turville-Petre 2005; Johnston 2011), the *York plays* (even if they are the work of different authors) display an identifiably northern variety interspersed with some southern/Midlands influences (in particular concerning loanwords, spellings including combinations of southern spelling and a northern rhyme, etc.).<sup>18</sup> In what follows, we will report our findings on properties of the agreement system as found in the *York plays*, focusing on 2sg (and 1sg) forms, and the distribution of the NSR. As already briefly mentioned above, the make-up of the agreement paradigm and the scope of the NSR depend in part on the inventory of pronominal forms. The pronominal system found in the individual plays is remarkably uniform, with variation being confined to differences in spelling. Table 9.3 gives an overview of the relevant subject forms (cf. Smith 1885: lxxii; Burrow & Turville-Petre 2005: 272; Johnston 2011).

<sup>17</sup>For our study we tagged the collection of plays which are part of *The corpus of Middle English prose and verse*. In addition, we conducted a full text analysis of all plays and looked through them manually, see references below.

<sup>18</sup>It is commonly assumed that dialectal features of south-east Midland and London varieties were introduced when the *York plays* were copied in the mid/late 15th century, cf. e.g. Beadle & King (1984).


Table 9.3: Subject pronouns as found in the *York plays*


As can be gathered from Table 9.3, the pronominal system of the *York plays* features the inherited 2sg subject pronoun *thou* in combination with the 3pl form *they* borrowed from Old Norse. We thus expect full verbs to take *-s* in 2sg contexts (in the present tense indicative).

The system of verbal agreement endings is characterized by a greater degree of linguistic variation, although it should be pointed out that the inventory of endings is quite limited.<sup>19</sup> In the present tense, the only significant residue of the formerly more elaborate OE/ME agreement paradigm is *-s*, which appears in a variety of different surface manifestations depending on factors such as spelling preferences and phonetic context (e.g. *-s*, *-is*, *-es*, *-ys* etc.).<sup>20</sup> In addition to the variants of the *-s*-marker, present tense verbs appear with zero inflection, or *‑e*. However, there are reasons to believe (e.g. evidence from rhymes) that the latter is usually not pronounced, representing the residue of a former contrast which by the time the *York plays* were composed was confined to the writing (cf. e.g. Johnston 2011). This leaves us with a basically binary contrast between variants of *-s* and variants of the zero marker (-∅, *-e*). The situation is made more complex by the workings of the NSR (which widens the scope of the *-s*-marker) and the fact that there are cases where the *-s*-marker and the zero marker seem to vary freely. Table 9.4 gives a rough overview of the distribution of markers in the present tense (for the time being abstracting away from variants of *-s* and -∅). Each cell of the paradigm contains the dominant (i.e. most frequent) marker, while competing variants are added in parentheses.<sup>21</sup>

<sup>19</sup>It is very likely that the linguistic variation found in the *York plays* is at least partially the result of the fact that the plays were composed by different authors. However, an in-depth investigation of the impact of authorship on the type of NSR found in the individual plays is well beyond the scope of the present paper.

<sup>20</sup>In addition, there are a few 3sg forms ending in *-th* such as *haith* 'have-3sg', which clearly reflect Midlands/southern influence.

<sup>21</sup>Table 9.4 is based on the descriptions in Smith (1885: lxxii), Burrow & Turville-Petre (2005: 272), and Johnston (2011), which we have cross-checked with our own corpus-based studies.



Table 9.4: Verbal inflection in the *York plays* (present tense indicative)

As can be seen from Table 9.4, the agreement system found in the *York plays* exhibits some special properties that possibly shed some light on the historical development of the NSR. First of all, the NSR seems to be restricted to the plural part of the paradigm, whereas the realization of singular forms is not influenced by the position or (in the case of 3sg) type of subject. According to the standard view of the historical development of the NSR, this seems to be indicative of an early stage of the NSR, where the agreement alternation had not yet spread to singular forms (see §3).<sup>22</sup> Second, it appears that while non-adjacency may license *-s*-inflection in connection with pronominal subjects, zero-marked forms or forms marked with *-e* do also occasionally turn up in this context.<sup>23</sup> The variation between *-s* and zero in connection with pronominal subjects non-adjacent to the verb seems to suggest a tripartite agreement system with a distinction between pronominal subjects adjacent to the verb (which invariably trigger zero marking), pronominal subjects non-adjacent to the verb (which trigger either *-s* or -∅), and nominal subjects (which always trigger *-s*). In what follows, we will first add more data and examples, including some quantitative findings resulting from our corpus study, before we address the question of how the agreement system should be analysed in §4. As noted above, we will focus on forms which have been neglected in previous work on the NSR, i.e. 2sg in particular.

In contrast to later NSR-varieties, *-s* is only rarely found with 1sg forms, which strongly tend to exhibit *-e*/zero marking in the present tense independently of

<sup>22</sup>But note that there are a few examples where *-s* seems to appear with a 1sg subject under non-adjacency, as shown in (11).

<sup>23</sup>The fact that the zero ending co-varies with *-s* under non-adjacency might be taken to represent an early stage of a development in which the type of subject constraint gradually gains more importance, eventually leading to zero marking of pronominal subjects independently of their position relative to the verb (contrasting with *-s*-marking of nominal subjects).


their position (adjacency/non-adjacency) relative to the subject. This is shown in (10). However, there are a few examples where NSR effects do show up in connection with 1sg subjects (the majority of them in connection with 'have'), as in (11):<sup>24</sup>

	- b. A, lorde, to the *I* *love* and *lowte*.
	  a lord to thee I love and bow
	  'Ah, Lord, I love and venerate you.' (York plays, 9, 189)
	- c. And so I schall fulfille / That I before *haue* highte.
	  and so I shall fulfil / that I before have promised
	  'And so I shall fulfil what I have promised before.' (York plays, 37, 396)
	- b. A, sir, a blynde man am *I* / And ay *has* bene of tendyr yoere Sen I was borne.
	  a sir a blind man am I / and always has been of tender year since I was borne
	  'Ah, Sir, I am a blind man and always have been of tender year since I was born.' (York plays, 25, 297)

c. *I* here the lorde and *seys* the nought.
   I hear thee lord and sees thee not
   'I hear the Lord and do not see you.' (York plays, 5, 139)

This corroborates the findings of Fernández-Cuesta (2011): in her study of the *LAEME* data, only two non-adjacent 1sg verbs occur with an *-s* ending (out of six unambiguous cases of non-adjacency).

<sup>24</sup>Examples taken from Davidson's (2011) edition of the *York plays* are referenced in the format "play number, line". All other examples are taken from the edition by Beadle (1982).


We will now take a closer look at 2sg forms, which present a set of interesting properties that are directly relevant for the analysis of the agreement system and the type of NSR found in the *York plays*. The following discussion is based on a data set of 852 clauses with 2sg subjects that we extracted from Davidson's (2011) edition of the *York plays*. With 2sg subjects (variants of the "old" form *thou*), *-s* is the dominant ending in the present tense (indicative), independently of whether the subject is adjacent to the verb or not (in general, non-adjacency between subject pronoun and finite verb is much less frequent in the corpus than adjacency). In other words, there are no clear NSR effects in the context of 2sg. Alternative forms of the *-s* inflection include markers extended by *-t(e)* (particularly frequent with forms of 'have', e.g. *hast(e)*, see Table 9.5),<sup>25</sup> and by pre-consonantal vowels (*-es*, *-is*, *-ys*). See Table 9.5 for the quantitative distribution of the 2sg endings with lexical verbs and (12–13) for a selection of relevant examples.

The subject and the finite verb are adjacent:

	- b. *Heris* *thou* not what I saie thee?
	  hears thou not what I say thee
	  'Don't you hear what I say to you?' (York plays, 31, 317)
	- c. *Thou* *makist* her herte full sare
	  thou makest her heart full sore
	  'You make her heart fully sore.' (York plays, 13, 251)

<sup>25</sup>Apart from verbs that are made up of only a single CV-pattern (e.g. *se* 'see'), we have counted here all verbs ending in *-e*, including forms such as *come*, *take* etc. There are seven instances (all under adjacency of subject and verb) where *-e* attaches to the *s*-ending, as in (i).

(i) And sen thou *dose* not as I thee tell, (York plays, 22, 169)

These are counted as instances of *-s*. In addition, there are four examples where the enlarged ending *-st* combines with *-e* (e.g. *saiste* 'say-2sg', 30, 477). Modals such as 'can', 'must', 'shall' always appear without *-s* (due to their origin as preterite-presents) and are therefore not considered here (there are a few instances of *moste* 'must-2sg', though).


d. Fro thens *come* *thou*, Lorde, as I gesse
   from thence come thou lord as I guess
   'From thence thou come, Lord, as I guess.' (York plays, 21, 114)

The subject and the finite verb are not adjacent:

	- b. *Thou* arte combered in curstnesse / and *caris* to this coste.
	  thou are troubled in cursedness / and cares to this cost
	  'You are troubled with sin and dread this price.' (York plays, 26, 171)

At first sight, it appears that the use of reduced markers (zero or *-e*) is also widespread. However, upon closer inspection it turns out that the vast majority of reduced endings represent subjunctive or imperative/optative forms (as illustrated in 14). The latter are conspicuously frequent, which can be attributed to the religious character of the plays, which include many prayers, or passages where the characters directly address Jesus or God. If subjunctive (and adhortative/optative) forms are filtered out, it appears that around 80% of 2sg lexical verbs carry some form of the *s*-inflection in the present tense indicative; see Table 9.5 for a summary of our quantitative findings.<sup>26</sup> Furthermore, it turns out that of the 31 cases with *-e*, 16 are forms of the preterite-present verb *witen* 'know', which usually does not inflect for 2sg. That is, the share of *s*-marked forms is probably even larger than 80%.

(14) a. Luk nowe that *thou* *wirke* noght wrang ...
     look now that thou act not wrong
     'Look now, that you do not wrong.' (York plays, s439)

<sup>26</sup>In quite a number of cases it is hard to tell whether we are dealing with a subjunctive or indicative form. This seems to support the hypothesis (cf. e.g. Sweet 1871) that the spread of the reduced ending involved a reanalysis of originally subjunctive forms as indicative (most likely in subordinate clauses).


b. Iff I haue fastid oute of skill, *Wytte* *thou* me hungris not so ill
   if I have fasted out of skill know you me hungers not so ill
   'If I have fasted unreasonably, you should know that I'm not so hungry.' (York plays, s1962)


Table 9.5: Verbal endings of the second person present tense indicative in the *York plays* (lexical verbs only)

In what follows, we take a closer look at the behaviour of the auxiliaries 'have' and 'be'. As shown in Table 9.6, variants of the *s*-ending (especially *-st*) are particularly frequent with 'have' in its use as a perfect tense auxiliary (almost obligatory, in fact).<sup>27</sup>

Table 9.6: Verbal endings of the second person perfect auxiliary 'have' in the *York plays*


So it appears that forms ending in *-s/st* are highly grammaticalized as realizations of the 2sg perfect auxiliary 'have'. Furthermore, note that the extended

<sup>27</sup>In connection with the perfect auxiliary 'have' 2sg *st*-forms are frequently extended with *e*: In cases where the subject is adjacent to the verb, we have found 9 instances of *haste* in inversion contexts, and 11 instances of *haste* without inversion.


2sg marker *-st* has been better preserved in connection with 'have', which can presumably be attributed to the fact that auxiliary 'have' is a highly frequent element. A similar frequency-related preservative effect can be observed with 2sg forms of 'be', albeit with a different effect on the distribution of *s*-marked forms, as illustrated in Table 9.7.


Table 9.7: 2sg forms of the auxiliary 'be' (present tense) in the *York plays*

'Be' differs significantly from the other verbs surveyed so far, and its special behaviour is of particular theoretical interest, as will become clear shortly. First and foremost, the *s*-marked form *is* (which is also standardly used in connection with all kinds of 3sg subjects) is quite rare;<sup>28</sup> in around 90% of all cases, the 2sg of 'be' is realized by a variant of *art*, with the extended form *arte* being twice as frequent as the short alternative. Again, the fact that the suppletive form of 2sg 'be' has been preserved in the *York plays* can be attributed to the high token frequency of *art(e)*, which in this case has blocked the spread of the *s*-marked alternative *is*. However, *art(e)* seems to be confined to contexts where the subject is adjacent to the finite auxiliary. In any case, the absence of *art(e)* in non-adjacent contexts seems to be noteworthy. It might well be that non-adjacent instances of *art(e)* are simply by chance absent from the records (recall that there is a strong tendency for pronominal subjects to be adjacent to the verb). Moreover, examples like (15) suggest that the use of *is* is not necessarily a reflex of the NSR in 2sg contexts, since *is* is used both under adjacency and non-adjacency with the subject pronoun.<sup>29</sup>

<sup>28</sup>The *s*-ending also appears on preterite forms of 'be' (*was*).

<sup>29</sup>Note that despite appearances, cases like (i) and (ii) are not to the point, since both *arte* and *haste* as well as *arte* and *caris* are the regular (fully inflected) 2sg forms of the relevant verbs.

(i) Why, *arte* thou a pilgryme and *haste* bene at Jerusalem
    why are thou a pilgrim and has been at Jerusalem
    'Why, are you a pilgrim who has been in Jerusalem?' (York plays, 40, 70)


(15) For thou *is* one and *is* abill and aught to be nere.
     for thou is one and is able and ought to be near
     (York plays, 32, 33)

3sg subjects always trigger *s*-forms (V+*s*, *has*, *is*); there is no trace of the NSR, that is, type of subject and position of the subject relative to the verb do not matter, as shown in (16):

	- b. Here sirs howe *he* *sais*, and *has* forsaken His maistir to this woman here twyes
	  hear sirs how he says and has forsaken his master to this woman here twice
	  'Hear sirs, what he says and how he has betrayed his master twice with this woman here.' (York plays, s2793)

As already briefly mentioned above, the effects of the NSR can be most readily observed with plural (pronominal) subjects. While nominal 3pl subjects usually require *s*-marking in the present tense, as shown in (17), the verb appears in its bare form when the subject is a pronoun adjacent to the verb. This is illustrated in (18).

	- b. To mischeue hym, with malis in there mynde haue thei menyd, And to accuse hym of cursednesse *the* *caistiffis* *has* caste.
	  to harm him with malice in their mind have they meant and to accuse him of sinfulness the captives has uttered
	  'To harm him with malice in their mind they complained and to accuse him of sinfulness the captives have uttered.' (York plays, s5243)

(ii) Thou *arte* combered in curstnesse / and *caris* to this coste.
     thou are troubled in cursedness / and cares to this cost
     'You are troubled with sin and dread this price.' (York plays, 26, 171)


	- b. Therfore some of my peyne *ye* *taste* / And *spekis* now nowhare my worde waste,
	  therefore some of my pain you[2pl] taste / and speaks now nowhere my word waste
	  'Therefore some of my pain you taste and speak now nowhere my word waste.' (York plays, 41, 87)

c. Howe these folke spekes of our chylde. *They* *say* *and* *tells* of great maistry
   how this folk speaks of our child they say and tells of great authority
   'How this folk speaks of our child. They say and tell of great authority.' (York plays, 15, 79)

In traditional descriptions of the inflectional system of the *York plays* it is sometimes taken for granted that NSR effects as in (18) are the norm with plural subject pronouns that are not adjacent to the verb (cf. e.g. Burrow & Turville-Petre 2005: 272). However, it seems that the agreement system is more variable. For example, there are also cases where the verb fails to be adjacent to the subject and still lacks *s*-marking as illustrated in (19).<sup>30</sup>

<sup>30</sup>In general, cases where pronouns are not adjacent to the verb are quite rare. It is therefore difficult to estimate the status of patterns such as (18) and (19). One might speculate that in at least some of those cases, the zero ending is used to facilitate rhyming as in (19c). Alternatively, cases of zero marking under non-adjacency might be taken to foreshadow the loss of the position-of-subject constraint, eventually leading to general verbal zero marking with


	- a. [...] this day
	  'Wherefore we go on our way and make offerings to God this day.' (York plays, 17, 228)

b. Yhe comaunded me to care, als *ye* kenne wele and *knawe*, To Jerusalem on a journay, with seele;
   you commanded me to care as you[2pl] know well and know to Jerusalem on a journey with good-fortune
   'You commanded me to come, as you well understand and know, to Jerusalem on a journey, with good fortune.' (York plays, 30, 336)

c. That lurdayne that *thei* loue and *lowte* / To wildirnesse he is wente owte
   that rascal that they love and venerate / to wilderness he is gone out
   'That rascal that they love and venerate he went out to the wilderness.' (York plays, 22, 32)

Particularly interesting in this regard is the behaviour of the plural of 'be', which is realized by variants of *are*. It turns out that independently of the category (nominal/pronominal) and the position of the subject (adjacent/non-adjacent to the verb), the plural form of 'be' is almost always *are*, that is, forms of 'be' are usually not subject to the NSR.<sup>31</sup> The different behaviour of 'be' and lexical verbs is illustrated by the examples in (20).

(i) Thei that *is* comen of my kynde [...]
    they that is come of my kind
    (York plays, 44, 128)

pronominal subjects (as in many present-day dialects). The extension of the zero marker could then perhaps be analysed as an analogical change made available by the overall rarity of cases where the pronoun fails to be adjacent to the verb.

<sup>31</sup>It should be pointed out, however, that there are a few examples, such as (i), where NSR effects do show up with non-adjacent forms of 'be'. At least with plural subjects, these are vastly outnumbered by cases where the regular plural form *are* appears under non-adjacency.


	- b. Men that *are* stedde stiffely in stormes or in see And *are* in will wittirly my worschippe to awake, And thanne *nevenes* my name in that nede,
	  men that are placed unwavering in storms or in sea and are in will fully my worship to awake and then call my name in that need
	  'Men who are unwavering in storms or at sea and are fully in will to awake my worship and then call my name in that need.' (York plays, 44, 137–139)

c. All that *are* in newe or in nede and *nevenes* me be name,
   all that are in harm or in need and call me by name
   'All of them are in harm or in need and call me by my name.' (York plays, 44, 144)

So there is a major difference between the plural forms of lexical verbs and 'be': With lexical verbs, nominal subjects differ from pronominal subjects in that the former always trigger *s*-marking on the verb (both in the singular and the plural), while the latter take part in the NSR. With 'be', however, nominal and pronominal subjects behave alike: singular subjects trigger *is*, while plural subjects invariably trigger *are*. In the following section, we will discuss the theoretical relevance of this asymmetry. We would also like to point out that 1sg forms seem to play a special role in that they are by and large (see above for some exceptions) exempt from the NSR, in contrast to the system listed in Table 9.2. Table 9.8 summarises our findings regarding the inventory of inflectional endings found with present tense verbs in the *York plays* ("pron." stands for "pronoun", "adjac." stands for "adjacent"; recall that "-∅" is a shortcut for zero marking and forms that end in *-e*).

## **4 The NSR in the** *York plays***: Towards an analysis**

An adequate analysis of the type of NSR as exhibited by the *York plays* should capture the following basic system-defining characteristics: (i) the effect of subject type/position of the subject on verbal agreement marking; (ii) the fact that apart from some minor exceptions (which probably reflect differences in authorship,


Table 9.8: Verbal inflection in the *York plays* (present tense indicative)


or language change in progress), the NSR seems to be confined to plural forms; (iii) the observed differences between 'be' and other verbs (only 'be' signals regular number agreement independently of type and position of the subject). In what follows, we will present a syntactic analysis of these findings that makes use of the notion that inflectional heads may lack phi-content when they enter the syntactic derivation, which Roberts (2010) calls "blank generation". The basic idea is that the absence of agreement features on the T-head may be repaired in different ways, either via insertion of default inflection (i.e., *-s* in many NSR varieties), or by incorporation of adjacent subject pronouns, leading to the presence of phi-features on T, which can then be spelled out by (marked/more specified) zero agreement.

However, before we turn to the specifics of that approach to the NSR, we would like to discuss in some more detail a set of morphological aspects pertaining to the agreement system as found in the *York plays*, including the inventory of markers and their featural specifications (see §4.2 for the question of how richness of inflection might be linked to the featural content of the relevant underlying inflectional heads in the syntax).

### **4.1 Morphological aspects**

The *York plays* exhibit a mixed system, where the NSR is more or less confined to the plural part of the paradigm (with a few exceptions in the 1sg) and has not yet spread to 'be'. In the inventory of present tense markers we still find 2sg forms extended by *t*, similar to earlier stages of English. The extended forms are rare with lexical verbs, but are the dominant pattern with auxiliary verbs (*hast(e)*, and in particular *art(e)*). With auxiliaries, they serve to preserve the distinction between 2sg and plural (and 3sg) forms, which is blurred with lexical verbs (due to the loss of final *t* in the 2sg).<sup>32</sup> The evidence for distinctive 2sg forms provided by auxiliaries precludes the general impoverishment rule suggested above (repeated here in 21), which would lead to system-wide syncretism of 2sg and 3sg forms (in varieties that have preserved *thou*).

(21) [+hearer] → ∅ / \_\_ pronoun[NOM]

To capture the fact that syncretism of 2sg and 3sg is confined to lexical verbs, we propose the following slightly modified version of (21), which applies only

<sup>32</sup>Note that it is not entirely clear whether the 2sg forms extended by *t* represent a retention or are the result of dialect contact (e.g., the MED lists *hæfes* as the 2sg of 'have' in Northumbrian OE).

### 9 The Northern subject rule revisited

to lexical verbs and deletes the verbal agreement feature [+hearer] when the finite verb is adjacent to a 2sg subject pronoun. As a result of (22), finite verbs that agree with 2sg subjects in the syntax lack positive values for [person] and [number] at the point of vocabulary insertion (assuming a realisational model of grammar, where phonological exponents of abstract morphosyntactic features are inserted postsyntactically, cf. e.g. Halle & Marantz 1993).<sup>33</sup>

(22) [+hearer] → ∅ / V\_\_ pronoun[NOM]

The system of present tense indicative markers for lexical verbs can thus be described by basically the same set of vocabulary items that we posited for the system in Table 9.2 (following standard assumptions, more specified exponents/markers take precedence over less specified exponents due to the elsewhere condition; Kiparsky 1973):

(23) a. [+phi] ↔ -∅
	- b. elsewhere ↔ *-s*

After deletion of [+hearer] (due to the impoverishment rule in 22), both 2sg and 3sg forms are spelled out by the default inflection *-s* (recall that we assume that "3sg" corresponds to the absence of (positive) specifications for [person] and [number]). In this way, (22), in combination with the inventory of agreement markers in (23), accounts for the lack of NSR effects with 2sg (and 3sg) subjects.

A slightly different set of vocabulary items is used for present tense indicative forms of 'have'. We take it that the extended form *hast(e)* still signals 2sg. To account for the fact that *hast(e)* covaries with the reduced and ambiguous form *has*, we assume that the same feature set can also be spelled out by *has* (probably as a result of phonological erosion, i.e. reduction of the final consonant cluster *st*), which happens to be homophonous with the elsewhere marker.

	- (24) c. elsewhere ↔ *has*

The present tense paradigm of 'be' has preserved even more distinctions (three persons in the singular, and the distinctive plural form *are*). Moreover, NSR effects are virtually non-existent with 'be',<sup>34</sup> and it is the only verb that exhibits

<sup>33</sup>Alternatively, one might assume that the *-s*-marker found with 2sg lexical verbs is still a genuine 2sg form, which only happens to be accidentally homophonous with the default *-s* found in other contexts (i.e., in the 3sg and plural).

<sup>34</sup>Recall that there are very few instances where *is* occurs with (non-adjacent) 1sg and 2sg subjects.

### Eric Fuß & Carola Trips

proper number agreement with nominal subjects. The inventory can thus be described as follows:

	- (25) b. [+hearer, −pl] ↔ *art(e)*
	- c. [+pl] ↔ *are*
	- d. elsewhere ↔ *is*

Note that the inventories in (23–25) single out *s*-marked forms as the elsewhere case. In the next subsection, we will address the question of why *s*-marked forms gain a wider distribution in contexts where the verb fails to be adjacent to a pronominal subject. In addition, we will argue that the absence of NSR effects in connection with 1sg (in contrast to 2sg and 3sg) and all forms of 'be' cannot be attributed to morphological properties, i.e., the inventory of vocabulary items plus impoverishment, and should thus receive a syntactic explanation.<sup>35</sup>

### **4.2 Syntactic aspects**

In this section, we will present an analysis of the agreement system displayed by the *York plays* that is based on Roberts's (2010) proposal that functional heads may enter the syntactic derivation without featural content (so-called "blank generation"). We will argue that a slightly modified version of Roberts's approach to the NSR provides enough leeway to account for the mixed or hybrid character of the agreement system found in the *York plays* (in particular, the special behaviour of 'be'), in contrast to previous theoretical analyses. We take it that the lack of NSR effects with 'be' and 1sg subjects reflects a genuine syntactic difference and should not be captured by purely post-syntactic/morpho-phonological mechanisms (in contrast to what we have proposed for the absence of relevant effects with 2sg and 3sg). More precisely, the facts suggest that in these cases, subject-verb agreement is established by a syntactic operation (e.g., Agree; Chomsky 2000) that leads to feature matching between the phi-content of a relevant functional head (T/INFL) and the subject, independently of type and position of the latter.

<sup>35</sup>An anonymous reviewer raised the question whether the asymmetry between 'be' and other verbs could not simply be analysed as a lexical difference, in the sense that the paradigm of inflected forms of 'be' is richer than the paradigms of other verbs. However, a lexical solution fails to account for the fact that the difference between 'be' and lexical verbs is syntactic in nature: With lexical verbs, the agreement alternation (that is, the NSR) is governed by syntactic factors (type and position of the subject), while no such effects are observed with 'be'.


Roberts (2010) outlines an analysis of the NSR that is based on the idea that in the relevant varieties, T/INFL lacks a phi-set of its own (blank generation). As a result, T/INFL enters the syntactic derivation without agreement features; it can only acquire such features via incorporation of (clitic) subject pronouns.<sup>36</sup> The presence of (positively specified) agreement features in T/INFL (resulting from the incorporation of clitic pronouns) is then signalled by zero marking on the verb, while *-s* is inserted as a default inflection when T/INFL lacks agreement features (cf. 23).<sup>37</sup> To account for the adjacency effect, Roberts assumes that incorporation must go hand in hand with phonological cliticisation of the subject pronoun to the verb.<sup>38</sup> In other words, a T/INFL head without an inherent phi-set may acquire agreement features in the course of the derivation when the conditions in (26) are met.

	- b. phonological cliticisation: (pronoun X V) (where X is null or another clitic)

This account provides a straightforward description of "pure" NSR systems similar to the one given in Table 9.2 where all verbs (including auxiliaries) take

(i) That we hym tharne sore may vs rewe
	that we him lose sure may us regret
	'We will certainly regret that we lost him.' (York plays, 42, 14)

<sup>36</sup>A related, but purely post-syntactic, analysis of the NSR is proposed by Trips & Fuß 2010, who posit the following agreement rule that operates on the output of the syntactic derivation:

(i) *-*∅ marks the presence of positive specifications for [person] or [number] in the minimal phonological domain the finite verb is part of; *-s* is inserted elsewhere.

Similar to an approach in terms of blank generation, (i) assumes that the relevant agreement features are provided by weak subject pronouns under adjacency with the verb. However, notice that the special behaviour of 'be' seems to call for a (partially) syntactic treatment of subject-verb agreement in the *York plays*. See below for further discussion and a synthesis of the two accounts.

<sup>37</sup>Recall that we assume that '3sg' corresponds to the absence of (positively specified) person and number features, cf. e.g. Harley & Ritter (2002).

<sup>38</sup>Interestingly, it seems that the only elements that may regularly intervene between a subject pronoun and a zero-marked verb are (weak) object pronouns as in (i).

This can be accounted for if we assume that both the subject and object pronoun are part of a clitic cluster that attaches to the verb.


the marked (zero) ending only in connection with adjacent non-3sg (clitic) pronouns, while *-s* occurs elsewhere. In addition to the adjacency condition, Roberts' analysis also correctly predicts that stressed, coordinated and modified forms (which are not clitics) trigger default inflection on the verb:

	- b. Him and me *drinks* nought but water. (Roberts 2010: 6)
	- c. Us students *is* going. (Belfast English; Henry 1995: 24)

However, something more must be said to capture (a) the fact that the pronoun's phi-set is spelled out twice (as the pronoun itself and as zero marking on the verb), and (b) the observation that in many NSR dialects, the marked zero inflection also appears in inversion contexts, where the finite verb precedes an adjacent subject pronoun:

(28) So sir, *slepe* ye, and *saies* no more.
	so sir sleep you.pl and say no more
	(York plays, 30, 148)

Under the assumption that incorporation of the pronoun is a purely syntactic process, the fact that it may precede (compare 18) or follow the zero-marked verb (as in examples like 28) does not seem to receive a satisfying explanation. If incorporation is analysed as an instance of head movement, we would expect that the relative order of pronoun and finite verb is not variable. As a possible solution, one might suggest that the linearisation of the incorporated pronoun is sensitive to the syntactic position of the finite verb, in the sense of a second position/Wackernagel effect that is only triggered when the verb has moved to C. However, such an account would be quite stipulative. In what follows, we would like to argue that a more principled explanation becomes available if we take a closer look at the nature and cause of the assumed incorporation process. What we would like to propose is that in the NSR varieties, incorporation of the pronoun is in fact a postsyntactic repair operation that is triggered to patch up a T head that enters the morpho-phonological component without phi-content. The rationale behind this idea is that in a language with at least some morphological agreement, a phi-less T-head creates a problem at the interface to the morpho-phonological component.<sup>39</sup> This problem can be repaired either by

<sup>39</sup>Arguably, no such repair is needed in languages that completely lack agreement features (e.g., Indonesian).


the insertion of default inflection (a last resort prior to or during vocabulary insertion), or by "incorporation" of an adjacent phi-set that can then be spelled out by an appropriately marked agreement formative. The latter option is arguably more specific/complex and therefore preempts repair via default inflection (due to the elsewhere condition). To account for the fact that both the pronoun and the phi-set on T are spelled out (the latter usually via the zero marker in the NSR varieties), we assume that the pronoun's phi-set is copied onto the finite verb/T under adjacency (i.e. when both elements are part of the same minimal prosodic domain). Crucially, this repair operation (giving rise to zero inflection) can apply in both inversion and non-inversion contexts as long as the pronoun is directly adjacent to the finite verb (note that this modification of Roberts' original account combines the idea of blank generation with certain aspects of the postsyntactic approach proposed by Trips & Fuß 2010, cf. footnote 36).

Some additional tweaking is needed to account for the intricacies of the version of the NSR that is found in the *York plays*. First of all, it is evident that in contrast to other verbs, 'be' cannot be subject to blank generation. Rather, 'be' is the phonetic realization of a special T/INFL node that comes with its own phi-features (in contrast to T/INFL linked to other verbs). As a result, 'be' may agree with non-pronominal subjects as well. Note that the special behaviour of 'be' is a major challenge for theoretical approaches that analyse agreement/nonagreement as the result of different subject positions (as e.g. de Haas & van Kemenade 2015). The fact that regular number agreement occurs with nominal subjects (which otherwise do not trigger agreement) shows that the structural position of the subject is not relevant. Rather, it seems that 'be' (in contrast to other verbs) can detect the phi-features of any kind of subject (independent of its position and categorial nature) due to the fact that the T head associated with 'be' always carries an unvalued set of phi-features that triggers a syntactic Agree operation. Thus, we take the asymmetry between 'be' and other verbs to suggest that blank generation may be parameterized so that it affects only certain types of inflectional heads.

Basically the same approach can be used to account for the absence of NSR effects with 1sg subjects. Again, we assume that T is not subject to blank generation in this case. Of course, this raises the more general question of how and why blank generation of inflectional heads is triggered. What we would like to propose is that the absence of agreement features on T is intimately linked to the breakdown of the (morphological) agreement system in Northern Old/Middle English. Recall that as a result of phonological erosion (and probably language contact with Scandinavian), *-s* (or rather, variants of it) became the only overt agreement marker in Northern varieties, eventually leading to a binary agreement system that no longer signals featural distinctions apart from [+/−phi]. We take it that this is the prototypical situation that brings about wholesale "blank generation" of T/INFL.<sup>40</sup> In the *York plays*, however, we still find a slightly richer system of endings. In addition to the fact that 'be' has preserved more inflectional distinctions than other verbs (including a systematic distinction between 2sg and 2pl), the zero ending is still closely linked to 1sg, in that it unambiguously signals [+speaker, −plural] with singular subjects and in cases where the subject fails to be adjacent to the verb (presumably reflecting an earlier pre-NSR stage where 1sg was the only feature combination that was clearly marked on the verb, by zero marking; cf. Table 9.2). It thus seems plausible to assume that blank generation of T is blocked in contexts where agreement marking can still be linked to featural distinctions that are more specific than a binary [+/−phi] contrast. Our approach to the NSR in the *York plays* is summarized in (29):

	- b. *no NSR effects/'be' & 1sg*: no blank generation of T, regular syntactic agreement;
	- c. *no NSR effects/2sg & 3sg*: impoverishment and underspecification of markers (→ *-s*).

This approach captures basic properties of the agreement system exhibited by the *York plays*. However, note that in addition to these general patterns, we have also observed a number of alternative agreement options. Some of these are presumably residues of a former system (such as the few cases of 2sg *-st* on lexical verbs), while others represent innovations that compete with some of the options in (29), such as NSR effects in connection with 1sg (which can perhaps be analyzed as extensions of blank generation to 1sg contexts), and cases where the 'position of subject' constraint seems to be neutralized, leading to general zero marking with pronominal subjects (which foreshadows a development that has

<sup>40</sup>On a more technical note, one might assume that blank generation of T/INFL results from another type of impoverishment rule that deletes person and number features from T/INFL before the latter enters the syntactic derivation (cf. e.g. Müller 2006 on the notion that impoverishment rules may also operate presyntactically):

(i) [Person, Number] → ∅ / T\_\_

However, note that such an approach raises a number of questions concerning the interplay between presyntactic and postsyntactic impoverishment that we cannot discuss here. We leave this issue for future research.


taken place in a number of NSR dialects).<sup>41</sup> The existence of this type of linguistic variation suggests that the particular version of the NSR that is found in the *York plays* represents an intermediate stage that eventually gave way to a more balanced agreement system where blank generation of T/INFL is not (lexically) confined to certain contexts.

## **5 Some remarks on the historical origin of the NSR**

So far, we have presented a theoretical analysis of the NSR in terms of "blank generation" of inflectional heads. From a diachronic point of view, we have seen that in OE, special inversion contexts show an unexpected -*e* affix which can be interpreted as foreshadowing the NSR, and that this rule actually occurred in some ME texts. In this section, we will bring these observations together and argue that after the breakdown of the OE agreement system, the NSR developed via a combination of generalized V2 in the northern varieties and agreement weakening in inversion contexts (which turned into the NSR after the loss of V2).<sup>42</sup>

The starting point for our diachronic analysis is Northumbrian OE, where only 1sg is unambiguously marked by verbal agreement (via *-e*/∅). Elsewhere, we find some form of *-s* marking, which alternates with the dental markers in 3sg contexts and in the plural part of the paradigm. The question then is how and why new zero markers were introduced into the northern paradigm. We believe that the rise of new zero-marked plural forms is closely related to the phenomenon of agreement weakening in OE. Following Roberts (1996), we analyze OE agreement weakening in terms of contextual allomorphy of 1pl/2pl forms, which can be attributed to syntactic factors, namely the structural position of the finite verb (similar to complementizer agreement in present-day West Germanic dialects): the reduced form is used only when the verb moves to C (in contexts with fronted operators such as *wh*, negation, etc.), while full agreement obtains in all other contexts, where the verb occupies a lower inflectional head (Infl/T) (cf. e.g. Cardinaletti & Roberts 2002; Pintzuk 1999; Hulk & van Kemenade 1995; Kroch & Taylor 1997; Haeberli 1999; Fischer et al. 2000, and many others). As a result, agreement weakening is confined to inversion contexts where the finite

<sup>41</sup>A fuller description and quantitative analysis of the agreement options in the *York plays* is beyond the scope of this paper. We leave it for future investigation.

<sup>42</sup>Some authors (cf. Hamp 1976; Klemola 2000; Filppula et al. 2002; de Haas 2008) have claimed that the rise of the NSR was promoted by language contact with the Brythonic Celtic languages, which exhibit a similar distinction between pronouns and non-pronouns. See e.g. Pietsch (2005a), de Haas (2011) and Benskin (2011) for critical discussion.


verb immediately precedes a 1pl/2pl subject pronoun. In the other cases where the finite verb is in a lower position we find regular agreement with both subject pronouns and full subject DPs. This is illustrated with the following structures:

(30) a. [CP Op [C′ C+Vfin [TP subj.pron. [T′ T [VP … ]]]]] → agreement weakening
	b. [CP XP [C′ C [TP [T′ T+Vfin [VP DP subject … ]]]]] → regular agreement
	c. [CP XP [C′ C [TP subj.pron. [T′ T+Vfin [VP … ]]]]] → regular agreement

The evidence available suggests that this kind of systematic (syntactic) agreement weakening was originally confined to southern varieties of OE, while northern texts show only occasional examples of reduced agreement endings (i.e., schwa or -∅) in inversion contexts (cf. e.g. Berndt 1956; Cole 2014 on Northumbrian OE). In other words, it does not seem to be possible to analyze the NSR as a direct continuation of OE agreement weakening (but recall that Northumbrian OE exhibits a related pattern where the *s*-marker appears under adjacency with a subject pronoun). However, it seems likely that the agreement patterns that eventually turned into the NSR entered northern grammars via dialect contact with southern varieties (cf. Pietsch 2005b: 53f. for discussion). In the northern varieties the original OE pattern shown in (30) was then generalized to all contexts with adjacent plural subject pronouns (cf. Rodeffer 1903; Pietsch 2005b).<sup>43</sup> But why did this only happen in the northern varieties? To answer this question, let us take a closer look at grammatical factors that shaped the impact of dialect contact and possibly led to the rise of the NSR in the northern varieties. It has been claimed by a number of authors (cf. e.g. Kroch & Taylor 1997; Trips 2002) that there are major syntactic differences between northern and southern early ME varieties.<sup>44</sup> In particular, the northern varieties had developed generalized V2 which means that the finite verb consistently occurred in C regardless of the

<sup>43</sup>Rodeffer's proposal is criticized by Berndt (1956), who argues that quantitative data from Northumbrian OE texts indicate that there is no direct link between agreement weakening in OE and the NSR (more precisely, Berndt argues that the evidence available to us suggests that agreement weakening had already been in decline in the northern varieties before *-s* was generalized to all persons and numbers; see Pietsch 2005b: 50ff. for comprehensive discussion and a critical assessment of Berndt's arguments).

<sup>44</sup>Moreover, the NSR could not have developed in the southern varieties for purely morphological reasons: the loss of plural /-n/ in the ME period served to neutralize the contrast between full and syncopated forms formerly introduced by OE Agr-weakening.


nature of the initial constituent. As a result of this change, the syntactic differences between subject pronouns and phrasal subjects became less clear-cut than in OE (the only remaining diagnostic is the placement of the subject relative to certain high adverbs; cf. de Haas 2011; de Haas & van Kemenade 2015 for details):

(31) a. [CP XP [C′ C+Vfin [TP subject [T′ T [VP … ]]]]]
	b. [CP subject [C′ C+Vfin [TP tsubj [T′ T [VP … ]]]]]

So as soon as the northern learners were confronted with southern agreement weakening, they could neither attribute it to a special position of the verb (due to generalized V2) nor, arguably, to a special position for subject pronouns, since the evidence for differential subject positions had become blurred. What we would like to propose is that, at this point, learners did not discard the pattern (presumably because it was too robustly attested in the input), but rather reanalysed it in terms of a structure where the radically impoverished inflectional head was endowed with phi-features via incorporation of the subject pronoun. This gave rise to an early version of the NSR that initially distinguished between 1pl/2pl pronouns and all other subjects. The reanalysis of southern agreement weakening as incorporation of subject clitics removed the syntactic restrictions on the distribution of the reduced endings: the syncopated 1pl/2pl forms were no longer confined to operator contexts, and agreement weakening was extended to all contexts with adjacent subject pronouns (VS and SV), including preverbal pronouns in both main and embedded clauses:

(32) … *we* *go-*∅ by trouthe, noghte by syghte, þat es, *we* *lyff-*∅ in trouthe, noghte in bodily felynge;
	we go by truth not by sight that is we live in truth not in bodily feeling
	(ROLLTR, 36.752)

A further result was that the rule was extended to 3pl contexts:

(33) … þe penance þat *þai* *suffer* …
	the penance that they suffer
	(ROLLEP, 86.368)

This extension can possibly be attributed to the fact that in the Northern ME varieties the original OE 3pl pronoun *hio/heo* was replaced by the Scandinavian form *ðai* (which later spread to all varieties). In inversion contexts, this innovation led to cluster reduction of [s + ð] to [ð] for phonetic reasons (possibly promoted by analogical pressure from the 1pl/2pl forms, cf. Pietsch 2005a: 56).

A closer look at morphological aspects of this change reveals that we can indeed talk about a "markedness reversal" (Pietsch 2005a) since the "weak" syncopated southern OE forms turned into the marked inflections in the NSR dialects. When the zero affix entered the northern grammars via dialect contact with the southern varieties, it was pressed into service as a marked agreement formative on the model of the zero inflection that occurred with 1sg subjects. The observation that NSR effects appeared first in connection with lexical verbs is perhaps related to the fact that the underspecified *s*-marker had already gained a wider distribution here, which facilitated a reinterpretation of the zero inflection as a marked agreement formative that contrasted with default *-s*.

After the initial reanalysis, independent changes led to the extension of the zero affix first from 1pl/2pl to 3pl, then to 1sg and – in some varieties – 2sg, when the former 2pl *you* replaced the original 2sg form *thou*. Note that the latter changes led to a more balanced and less complex agreement system combining general "blank generation" of T/INFL with a binary inventory of agreement markers ([+phi] ∅ vs. [−phi] *-s*).<sup>45</sup> The evidence from the *York plays* suggests that the development of this system, which corresponds to Table 9.1, proceeded via a set of intermediate stages where blank generation of T/INFL was restricted to certain verbs or verb classes and parts of the verbal paradigm that had ceased to show distinctive agreement marking.

## **6 Conclusions**

In this paper, we have discussed a set of open questions concerning the synchronic analysis and diachronic development of the NSR in northern varieties of English. We have presented a set of new data from the Northern ME *York plays*, which exhibit an early stage of the NSR where its effects are confined to plural forms of lexical verbs and 'have', while 'be' shows regular number agreement with all kinds of subjects. We have argued that the agreement system found in the *York plays* suggests a theoretical analysis of the NSR in which inflectional heads enter the syntactic derivation without a phi-set (due to pre-syntactic impoverishment leading to "blank generation", Roberts 2010) and acquire agreement features

<sup>45</sup>See Fuß (2010) for an analysis of relevant analogical changes in terms of a learning strategy that favours a minimal inventory of inflectional markers/features (based on the notion of *minimize feature content*, Halle 1997).

([person] and [number]) via the incorporation of clitic subject pronouns. Heads that have been endowed with positive specifications for [person] and/or [number] in the course of the syntactic derivation are spelled out by the zero marker. Elsewhere, the underspecified form *-s* is used. Based on this account, we have then suggested a new scenario for the historical development of the NSR, arguing that, after the breakdown of the OE agreement system, the NSR developed via dialect contact between northern and southern varieties. More precisely, we have proposed that syncopated verb forms (resulting from Agr-weakening in the southern dialect) were integrated into the northern grammar as marked agreement formatives that contrasted with *-s*. We have linked the rise of the NSR to the interplay of a set of morphosyntactic properties of Northern ME (including generalized V2 and the advanced loss of inflections), which made available a reanalysis where southern Agr-weakening was attributed to syntactic incorporation of subject pronouns, which supplied a radically impoverished T/INFL-head with agreement features. This contact-induced change paved the way for an extension of the NSR to adjacent pronouns more generally, including preverbal and singular forms.

# **Abbreviations**


## **Acknowledgements**

We would like to take this opportunity to express our gratitude and indebtedness to Ian Roberts for support, linguistic insight and advice over the years (including a set of invaluable dos and don'ts for giving an academic presentation like "never … during your own talk"). Instead of 60 candles on a birthday cake, we originally planned to give him 60 linguistic examples on this special occasion, but must admit that we have fallen a bit short of that (if you want to know the exact number, you're welcome to count!). This is when we changed our plans and decided to present Ian with a neat analysis of the NSR by using his notion of "blank generation". And voilà, it works!

Earlier versions of this paper were presented at DiGS 2010 in Cambridge and WOTM 2010 in Wittenberg. We are very grateful to the audiences for helpful comments and suggestions. In particular, we want to thank Patrick Brandt, Nynke de Haas, Fabian Heck, Roland Hinterhölzl, and Ans van Kemenade. In addition, we benefited from comments by two anonymous reviewers for this volume, which led to a number of improvements.

## **References**



# **Chapter 10**

# **All those years ago: Preposition stranding in Old English**

Ans van Kemenade

Radboud University

This squib revisits the case for preposition stranding (P-stranding) in Old English as it was argued in the hot debate on *wh*-movement in the 1980s. It looks at more recent literature on the relevant issues, finding that P-stranding in Old English warrants an analysis in terms of *wh*-movement, which should allow for movement of a zero prepositional object out of PP. Examination of the York corpus of Old English adds more detail to the known picture, but largely confirms the findings so far.

# **1 Background**

This squib follows up the discussion and analysis of preposition stranding (P-stranding) in specific types of Old English relative clauses in van Kemenade (1987), which has featured in discussion of various issues in more recent literature (Alcorn 2014; Emonds & Faarlund 2014). My treatment here is based on examination of the *York corpus of Old English* (YCOE) (Taylor et al. 2003); it readdresses some of the theoretical issues, and reconsiders the analysis.

Examples of P-stranding in present-day English are given in (1a–b), exemplifying P-stranding by *wh*-movement in *wh*-relative clauses. *Wh*-movement in relative clauses moves a constituent to Spec,CP (in modern terms), and may involve long *wh*-movement through an intermediate Spec,CP (1b). This *wh*-movement strategy allows preposition stranding relatively freely in present-day English, as in (1a,b):

Ans van Kemenade. 2020. All those years ago: Preposition stranding in Old English. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 221–231. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972846


	- b. That's the guy [CP who<sup>i</sup> I thought [CP t<sup>i</sup> I had told you about t<sup>i</sup> ]]

Preposition stranding in Old English is, however, not allowed in constructions comparable to (1). Relative clauses that involve movement of an overt relative pronoun are common in Old English texts, but they do not feature P-stranding (this is also true of *wh*-questions; Allen 1977; 1980). When a prepositional object is relativised, it pied-pipes the preposition along to Spec,CP, as in (2):

(2) Blickling, 89.13 (Allen 1980: 270)

	Gehyr ðu arfæsta God mine stefne, [CP [mid ðære]i ic, earm, to ðe cleopie ti]
	hear thou merciful God my voice with which I poor (one) to thee call

There are several other types of relative clauses in Old English that do allow P-stranding, and in which stranding is indeed obligatory. These share the property that they do not have an overt relative pronoun. I give examples of relatives with the invariant complementiser *þe*, with short and long relativisation, in (3) (both from van Kemenade 1987: 147–148), of an infinitival relative in (4), and an example of an adjective+infinitive construction in (5).

- a. *& het forbærnan þæt gewrit [CP 0<sup>i</sup> þe hit t<sup>i</sup> on awriten wæs]*
  and ordered burn the writ that it in written was
  'and ordered to burn the writ that it was written in'
- b. *Đonne hie lecgeað ða tiglan beforan hie [CP 0<sup>i</sup> þe him beboden wæs [CP 0<sup>i</sup> ðæt hie sceoldon ða ceastre Hierusalem t<sup>i</sup> on awritan]]*
  then they put the tiles before them that them ordered was that they should the city Jerusalem on draw
  'Then they put in front of them the tiles that they were ordered to draw the city of Jerusalem on.'

10 All those years ago: Preposition stranding in Old English

(5) LS8 (Eust) 315 (Fischer et al. 2000: 266)

*Wæs seo wunung þær swyþe wynsum on to wicenne*
was the dwelling-place there very pleasant in to live
'The dwelling-place there was very pleasant to live in.'

A special case is presented by relatives with *that* as the relative pronoun form, as we will see below.

P-stranding in constructions such as (3–5) featured prominently in the 1970s and 1980s debate on whether preposition stranding in the North and West Germanic languages is derived by *wh*-movement (Chomsky 1977; Chomsky & Lasnik 1977; Van Riemsdijk 1978: 286–297; Vat 1978; van Kemenade 1987), or by a second relativisation strategy of deletion over a variable (Maling 1976; Bresnan & Grimshaw 1978; Allen 1977; 1980), which may involve long-distance deletion. This debate has been resolved to the extent that, as far as the data can show us, both strategies are subject to subjacency (Allen 1980; van Kemenade 1987): they both respect the complex NP constraint and the *wh*-island constraint, and occur only in constructions that allow COMP-to-COMP movement. In the terms of Chomsky (1977) and Chomsky & Lasnik (1977), this means that they must result from *wh*-movement. Vat (1978), in the wake of Van Riemsdijk (1978), follows Allen (1977) in showing that Old English has the same type of P-stranding by R-pronouns such as *þær* 'there', satisfying subjacency, and argues that P-stranding in relatives without an overt pronoun must be due to *wh*-movement of *þær*, with subsequent deletion under identity with the antecedent.

Van Kemenade (1987) presents another variant of this analysis. The general ban on P-stranding in Van Riemsdijk's (1978) analysis is accounted for by the status of PP as a bounding node for subjacency. Dutch P-stranding is allowed because Dutch allows an "escape hatch" to this ban in the form of positions on the left of the preposition in (6a–b) that are designated for R-pronouns, the only (overt) items in Dutch grammar that strand a preposition:

	- b. Jan heeft het *daar* gisteren *over* gehad.
	- c. *Daar* heeft Jan het gisteren *over* gehad.

(6c) shows that R-pronouns also move to Spec,CP. Van Kemenade (1987) proposes a parallel analysis for preposition stranding by *þær* and by various types of pronouns in Old English: this is obligatory when the object of the preposition is *þær* 'there', and optional when the object is a personal pronoun (both examples from van Kemenade 1987: 117):

(7) a. Boeth, XXVII, 61, 20

*þæt þær nane oðre on ne sæton*
that there no others on not sat
'that no others sat (on) there'

b. WSgospel, Mt

*þa genealæhte him an man to*
then approached him a man to
'then a man approached him'

Van Kemenade (1987: 126–135) proposes that this type of pronoun fronting represents a form of cliticisation that is compatible with *wh*-movement, inspired by the fact that it applies to personal pronouns as well, and by the fact that the positions where *þær* and pronouns occur in Old English are special positions in Dutch syntax more generally. She extends this analysis to P-stranding in relatives without an overt pronoun as zero cliticisation; that is, P-stranding in the constructions exemplified in (3–5) is a case of *wh*-movement of a phonetically null variant of *þær* or a personal pronoun.

Let us now turn to a consideration of the merits of this approach in the light of more recent literature, based on an examination of the *York corpus of Old English* (YCOE, Taylor et al. 2003). The assessment concerns a number of issues, which I address in turn.


## **2 Locality conditions**

There is no evidence that the relation between the CP of *þe*-relatives and the variable with which they are associated in any way violates the subjacency condition, as noted above in relation to (3b). This would indicate that *þe*-relatives and related constructions in Old English are derived by *wh*-movement of a zero clitic or a zero operator. Note that the *þe*-relative is by far the most frequent relative in Old English (over 13,000 examples in YCOE, including some 500 examples of P-stranding), but long *wh*-movement is generally rare in Old English, and (3b) is one of only two examples in the YCOE corpus of a *þe*-relative with a long-distance dependency. Note, nevertheless, that the facts are compatible with subjacency, and I therefore assume, with van Kemenade (1987), that these constructions are derived by *wh*-movement of a zero element that is identified under identity with the antecedent. This is in line with the fact that they are most typically restrictive relatives.

## **3 P-stranding by pro-forms and zero pro-forms**

I now turn to a renewed assessment of the question to what extent it is justified to parallel P-stranding by pronouns and *þær*-adverbs with P-stranding in constructions with an invariant complementiser. An argument in favour of this parallel might be an observation in Alcorn (2014) that there are two spelling variants of the Old English ancestors of the prepositions *by* and *for*: {be} and {for} for unstranded prepositions, and {bi, big, bii, by, bie} and {fore} for stranded prepositions. She argues that the choice between the two is prosodically conditioned, with the stranded variant being prosodically independent. This observation applies equally to prepositions stranded by *þær* and personal pronouns, and to those stranded in *þe*-relatives, which suggests that the prepositions involved behave similarly. Observe, however, that this does not necessarily mean that the stranding strategies are the same; the spelling choice could instead be determined by the preposition's pre-verbal or clause-final position.

Allen (1980) argues against the parallelism between stranding by *þær* and personal pronouns and stranding in *þe*-relatives. *Þær*-relatives, which also involve stranding, had been introduced into the debate by Vat (1978), who argues that *þe*-relatives are really *þær*-relatives with subsequent deletion of *þær* in Spec,CP under identity with the antecedent. Allen argues that *þær*-relatives and *þe*-relatives take different antecedents, with *þær*-relatives occurring with inanimate antecedents only, while *þe*-relatives take any antecedent. This observation is borne out by examination of the YCOE corpus: *þær*-relatives, totalling 315 in number, are frequently found with NP antecedents that have no locative connotation, but these are not animate; they comprise rather diverse notions such as '(utter) darkness', 'the heavenly kingdom', 'eternal life', 'the course of things', 'hellfire', 'tortures', 'the fairness of glory', 'wedlock', and so on. Allen also argues that a parallel between stranding by personal pronouns and stranding in *þe*-relatives is problematic in view of the fact that the range of prepositions stranded by pronouns is limited, whereas this is not the case in *þe*-relatives.


An argument not mentioned by Allen which may also be important is that *þe*-relatives are dominantly restrictive, whereas *þær*-relatives are often non-restrictive.

(8) Or\_6:3.136.4.2863

*& on oþerre wæs an gewrit, þær wæron on awritene ealra þara ricestena monna noman*
and on other was a writ there were on written all the richest men's names
'and in the other was a writ, on which were written the names of all the richest men'

Note that the clause introduced by *þær* in (8) is ambiguous between a V2 main clause and a non-restrictive relative. This is frequently the case in *se*-relatives and *þær*-relatives (cf. Los & van Kemenade 2018). Surely the identification with the antecedent must be subject to tighter restrictions in restrictive relatives, where the relative clause serves to further identify the antecedent.

On the basis of the arguments reviewed so far, we may dismiss an analysis in terms of overt *þær*/pronoun movement to Spec,CP with subsequent deletion, *pace* Vat (1978), since on this analysis we would expect a complete parallelism between *þær*- and *þe*-relatives, and this is not borne out. Van Kemenade's zero cliticisation approach allows a broader set of contexts for extraction, including personal pronouns. Let us suppose that the zero clitic is in effect a zero operator which piggybacks on the escape hatch out of PP that is overtly available in the grammar, and which can be used more liberally in restrictive relatives with a zero operator, and in other clauses where the identifying context for the zero operator is strict. There are several analyses to this effect available in the literature. One is Abels (2003; 2012), who casts the escape hatch in terms of phase theory, making crucial use of a zero parallel to R-stranding in Dutch. He proposes that Dutch R-pronouns (including their zero variant) are base-generated on the left of P of a special class of zero place prepositions. An argument against this analysis is thus again that it works for some prepositions only, whereas stranding in *þe*-relatives is general for all prepositions. Another analysis to the same effect is Matsumoto (2013), who argues for a cyclic linearisation analysis that capitalises on the idea that (zero) prepositional objects can be extracted in contexts where V and P have the same head–complement parameters. In effect, this means that extraction is only possible when the complement of P is on its left (for whatever reason). All analyses along these lines thus make use of a position on the left of P that allows an escape hatch for extraction of the (zero) prepositional object.


At this point, it is also interesting to look at Old Norse, which has a relativisation strategy with an invariant complementiser *er* or *sem*, in which (zero) prepositional objects are relativised, stranding the preposition (Faarlund 2004: 260; see Maling 1976 for present-day Icelandic). Interestingly, Old Norse also has some form of stranding by pronouns, although apparently on a more limited scale: Faarlund (2004) cites an example of pronoun topicalisation with stranding (2004: 233, his (98)), and of an R-pronoun stranding a preposition in a non-root question (2004: 258, his (32c)).

Emonds & Faarlund (2014) assume that Old English had no preposition stranding, based on van Kemenade's (1987) analysis of stranding in relatives with invariant complementisers as zero cliticisation. This glosses over the fact that zero cliticisation is precisely van Kemenade's analysis of P-stranding in relatives with invariant complementisers, a construction clearly shared by Old English and Old Norse.

An important remaining point concerns locality conditions: the evidence underlying Allen's (1980) and van Kemenade's (1987) conclusion that the various relativisation strategies respect subjacency is far from robust, although it is consistent across clause types and extraction sites. Abels (2003: 181–186) argues that comparatives of inequality provide the one context which can only involve operator movement. Here, we run into a robustness problem once again: there is only one relevant example of a comparative of inequality with P-stranding in the YCOE corpus:

(9) Or\_2:5.48.36.938

*to beteran tidun þonne we nu on sint*
to better times than we now in are
'in better times than we are in now'

We can conclude that the evidence is consistent with subjacency, although we would like to base this conclusion on more robust data. I nevertheless maintain that relatives with invariant complementisers and other *wh*-related constructions with zero operators are movement constructions. There is a general ban on P-stranding, and I follow Abels (2003; 2012) in taking P to be a phase head: a zero operator can be extracted out of PP via its Spec, the phase edge. I leave the details for further research (see e.g. Walkden 2017, CGSW abstract). The fact that there was stranding at all was an important basis for the extension of stranding to other contexts over the Middle English period.


## **4 Other instances of P-stranding in relatives**

I now turn to further evidence for stranding in Old English, which also occurs in *that*-relatives, albeit to a limited extent. This is an interesting construction to consider, since the *þe*-relative is presumably the historical precursor of the present-day English *that*-relative, which is also typically assumed to involve *wh*-movement, either of a null operator, or of a *wh*-pronoun with subsequent deletion under identity with the antecedent. Old English *that*-relatives are ambiguous: we could regard *that* as an overt demonstrative pronoun, which would make the *that*-relative a neuter gender instance of the *se*-relative (which is usually non-restrictive); we could alternatively regard it as an early instance of an invariant complementiser. There is evidence both ways: of the total of 2,743 examples of *se*-relatives in the YCOE corpus, I found 42 coded as *se*-relatives with stranding. All of these have a demonstrative as relative pronoun, and the complementiser *þe*. 21 of the cases have *ðæt* as the relative marker, and have straightforward neuter antecedents, such as (10) with neuter *sweord* as antecedent; a further 12 have two *þæt* forms, the neuter demonstrative pronoun *ðæt* as antecedent, and *þæt* as the relative marker, as exemplified in (11); two examples have a feminine antecedent (12). Four examples have a relative form other than *þæt*, viz. *þære* (feminine genitive/dative singular, with a feminine antecedent), *þæm* (masculine/neuter dative singular), or *þa* (masculine/neuter nominative/accusative plural). This last group once again includes (12), remarkably with a feminine antecedent *mægþe*.

(10) Bede\_2:10.138.4.1327

*Þa sealde se cyning him sweord, þæt he hine mid gyrde; …*
then gave the king him (a) sword that he himself with girded
'Then the king gave him a sword, which he girded himself with'

(11) CP:46.351.5.2368

*…, sua him læs licað ðæt ðæt hie to gelaðode sindon,*
so them less pleases that that they to called are
'…, the less they are pleased with that to which they are called'

(12) Bede\_5:22.478.23.4805

*Ond þonne Norþanhymbra mægþe þæm Ceolwulf se cyning in cynedome ofer is,*
and then Northumbrians' province that Ceolwulf the king in kingship over is
'And in the province of Northumbria, over which King Ceolwulf reigns.'


The majority of these examples (21 + 12) is thus compatible, on the one hand, with a pronominal interpretation of *that* (since in most cases the antecedent is neuter) and, on the other hand, with that of an invariant complementiser (assuming that P-stranding involves zero operator movement). The cases with a feminine antecedent (2 in total) suggest that *that* is an invariant complementiser, since a gender mismatch between antecedent and relative pronoun would not be expected. The cases with pronominal forms other than *that* (4 in total) suggest, on the other hand, that movement of the pronoun strands the preposition, since the form of the pronoun is incompatible with an interpretation as invariant complementiser. Old English *that*-relatives with stranding thus provide some evidence for P-stranding by an overt relative pronoun, in a specific context.

The YCOE corpus also features two examples of relatives coded as *se þe* relatives with P-stranding. One of these seems to be unreliable, as it is presumably not a *se*-relative but a *þe*-relative on an antecedent that is appositive in the context. (13) looks like a bona fide case of a *se þe* relative with P-stranding.

(13) Bede\_4:31.376.2.3751

*Swylce eac ealle ða hrægl, þa ðe he mid gegearwad wæs,*
such also all the robes which that he with attired was
'Also all the robes in which he was attired, …'

The observations about *that*-relatives fit well with the analysis sketched here: *þæt* is at this stage of the language clearly to some extent ambiguous between relative pronoun status and its later grammaticalised complementiser status, witness the fact that it features a substantial number of cases of P-stranding. We also find the first instances of unambiguous P-stranding by a relative pronoun as in (13).

In conclusion, we can say that the findings of the 1980s literature on P-stranding largely hold up. This applies to the theoretical analysis (any analysis must somehow allow for relatively free extraction out of PP when the prepositional object is a zero element), as well as to the factual coverage now allowed by the YCOE corpus (we can present more detail now, but there are no facts that were glossed over earlier).

## **Abbreviations**

YCOE York corpus of Old English


## **References**




# **Chapter 11**

# **From macro to nano: A parametric hierarchy approach to the diatopic and diachronic variation of Italian** *ben*

Norma Schifano University of Birmingham

# Federica Cognola

La Sapienza University, Rome

In this squib we discuss the morpho-syntactic requirements affecting the distribution of the Italian discourse particle *ben* (lit. 'well') as employed in a selection of regional varieties of the language. We present a preliminary comparison with its attestations in earlier stages of the language and we show how the attested diatopic and diachronic variation may be modelled in terms of a parameter hierarchy of the type developed by the *ReCoS* team.

# **1 Introduction**

The aims of the following squib are: (i) introducing the morpho-syntactic requirements affecting the distribution of a poorly studied discourse particle, namely Italian *ben* (lit. 'well'), as employed in a selection of regional varieties of the language, building on the work in Cognola & Schifano (2015; 2018a,b) (§2), (ii) presenting a preliminary comparison with its attestations in earlier stages of the language (§3), and (iii) showing how the attested diatopic and diachronic variation is particularly relevant for our understanding of comparative syntax in that, far from being random, it fits the predictions of the parametric hierarchy approach, as developed by the *ReCoS* team (Roberts 2012; Biberauer & Roberts 2012; 2015; 2016; Biberauer, Holmberg, et al. 2014; Biberauer, Roberts & Sheehan 2014, a.o.; §4).

Norma Schifano & Federica Cognola. 2020. From macro to nano: A parametric hierarchy approach to the diatopic and diachronic variation of Italian *ben*. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 233–250. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972848

The challenge raised by the behaviour of this element is twofold. On the one hand, particles represent a "poorly understood group of elements" (Biberauer & Sheehan 2011: 387) which raise a number of both empirical and theoretical questions, including i.a. a proper understanding and adequate description of their individual syntactic functions, of their (lack of) ordering restrictions, as well as defectivity, optionality, degree and path of grammaticalization and pragmaticalization, semantic contribution, etc. (Biberauer & Sheehan 2011; Biberauer, Haegeman, et al. 2014). On the other hand, the sub-category of *discourse* particles introduces a number of even more complex issues.<sup>1</sup> According to Zimmermann's (2011: 2012) semantic criterion, discourse particles can be defined as "expressions [which] contribute only to the expressive content of an utterance, and not to its core propositional content" (cf. also Bayer & Obenauer 2011: 450, a.o.). This means that any formalization of discourse particles must be able to capture not only their syntactic behaviour and structural status (as, for example, (deficient) adverbs, Cardinaletti 2011; 2015; Manzini 2015; TP pro-forms, Haegeman & Weir 2015; speech act functional heads, Haegeman 2014; Hill 2014; propositional anaphors, Hinterhölzl & Munaro 2015), but also their discourse properties, which involve highly heterogeneous non-syntactic categories such as speakers' "emotional needs" (von der Gabelentz 1969 [1891]; cf. i.a. the expression of commitment, e.g. German *wohl*, Zimmermann 2011; confidence, e.g. *ben* in some varieties of Italian, Coniglio 2008; Cardinaletti 2011; surprise, e.g. Cantonese *me1*, Li 2006; surprise-disapproval, e.g. Bangla *ki*, Bayer 1996; concern, e.g. German *denn*, Bayer & Obenauer 2011; impatience, e.g. Dolomitic Ladin *po*, Hack 2014: 52) or context/common ground dependence (cf. i.a. presupposition, e.g. Italian *mica*, Cinque 1976, Zanuttini 1997; existence of mutual knowledge, e.g. 
German *ja*, Zimmermann 2011; evidentiality, e.g. Bellunese *lu*/*ti*/*mo*/*po*, Hinterhölzl & Munaro 2015), just to mention a few.

In what follows, we leave these issues aside, simply assuming that Italian *ben* is a discourse particle located in the IP area (see further discussion in Cognola & Schifano 2018a,b).<sup>2</sup> Instead, we focus our attention on the diatopic distribution of this element, as this proves to be particularly interesting in that it is

<sup>1</sup>A wider related issue concerns the notion of "discourse" itself, which is too vast for us to be able to discuss it here. The reader is referred to Bayer et al. (2015) for an updated overview of some of the most prominent proposals about its codification and relationship with syntax.

<sup>2</sup>As for the syntactic status of *ben*, the reader is referred to Cognola & Schifano (2015; 2018a,b), where *ben* is analysed as a weak XP (in the sense of Cardinaletti & Starke 1999), as it is subject to the series of syntactic restrictions affecting weak elements (e.g. impossibility of fronting, coordination, focusing) which do not extend to *ben(e)* when used as a manner adverb. More

11 From macro to nano

subject to an increasing set of morpho-syntactic restrictions which reflect the macro > meso > micro > nano typology of parameters of the kind advocated by the *ReCoS* approach. Accordingly, we claim that the fine-grained diatopic variation which affects Italian *ben* can be modelled in terms of a parameter hierarchy, which allows us to gain insights also into the diachronic development of this element. The case of Italian *ben* thus provides evidence that the adequacy of the parametric hierarchy approach stretches to (the morpho-syntactic behaviour of) elements at the syntax-discourse interface.

# **2 Italian: Diatopic variation in morpho-syntactic requirements**

Consistently with the cross-linguistic behaviour of manner adverbs, which are known to have developed homophonous forms with a discourse value both in Romance (Belletti 1990; 1994; Lonzi 1991; Cinque 1976; 1999; Vinet 1996; Waltereit & Detges 2007; Coniglio 2008; Hernanz 2010; Cardinaletti 2011; Padovan & Penello 2014, a.o.) and Germanic (Weydt 1969; Baardewyk-Resseguier 1991, a.o.), the Italian manner adverb *ben(e)* 'well' (1a,b) co-exists with the non-adverbial element *ben* (1c), which has been traditionally described as conveying an emphatic/assertive meaning, used to reinforce the assertion (Belletti 1990; 1994; Lonzi 1991) and to express speakers' confidence about the propositional content of their assertion (Coniglio 2008; Cardinaletti 2011):<sup>3</sup>

(1) Italian (Cinque 1999: 171, fn. 20)

a. Carlo disegna *bene*.
   C. draw.prs.3sg well
   'Carlo is good at drawing.'

specifically, we assume that when it is used as a discourse particle, *ben* is licensed in NegPresuppositionalP by a silent negative operator in ForceP and receives its presuppositional character by a Focus in PolarityP (see Hernanz 2010 for the role of PolarityP in the licensing of Spanish *bien*). Also note that, according to the above definition of discourse particles (also called *modal particles* in the literature due to their semantics and position in the clause, see Weydt 1969), these elements have to be kept distinct from so-called conversational-management elements: the latter have a pragmatic function similar to discourse particles, but are typically hosted in the CP layer. Interestingly, the Italian manner adverb *bene/ben* has also developed a usage as a conversational-management element (cf. *be'*).

<sup>3</sup>The translation of *ben* in (1b) is the one offered in the cited work. On the whole, *ben* does not seem to have an immediate corresponding form in English, where it could at best be rendered with an emphatic stress on the verb or as *indeed*. As such, it will not be translated in the examples below coming from our corpus of contemporary Italian, while it will be rendered with various periphrases in the early examples, according to the context.

Norma Schifano & Federica Cognola


In Cognola & Schifano (2018a,b), we have argued instead that the core property of Italian *ben* is that of denying the interlocutor's negative presupposition (cf. also Waltereit & Detges 2007 on French and Hernanz 2010 on Spanish), i.e. *ben* can only occur in (syntactically positive) contexts in which the negative counterpart of the proposition expressed by the sentence is part of the common ground (cf. Cinque 1976 on *mica*):

- a. Speaker A: (negative presupposition)
  Nicola non l'avrebbe neanche toccata quella roba.
  N. not it=have.cond.3sg even touched that stuff
  'Nicola wouldn't even have touched that stuff.'
- b. Speaker B: (negative presupposition denied)
  Nicola l'avrebbe *ben* mangiata la carne.
  N. it=have.cond.3sg ben eaten the meat
  'Nicola would have eaten the meat.'

In order to shed further light on the behaviour of this element, we collected data from native speakers and found that regional varieties of Italian can be classified into three main groups, in accordance with the morpho-syntactic requirements exhibited by *ben*: Group 1 (Trentino), Group 2 (mainly Venetan varieties) and Group 3 (Rovigo, plus localities in Friuli Venezia Giulia, Lombardy, Piedmont, Emilia Romagna, Lazio, Marche and Puglia).

Looking at the morpho-syntactic requirements in more detail, the following restrictions can be identified:<sup>4</sup>

<sup>4</sup>The following morpho-syntactic restrictions were identified through a questionnaire run with 28 speakers of mixed age, gender and education from 15 different localities, who were asked for grammaticality judgements on a 1–5 scale on 67 sentences testing the occurrence of *ben* across a variety of verb forms and tense, aspect, mood (TAM) contexts (see Cognola & Schifano 2018a,b for details). The reader is referred to the aforementioned works for a discussion of one additional morpho-syntactic restriction which was identified for Group 3 (cf. a preference for transitive over unaccusative verbs) and a difference in the interpretative requirements of *ben* between Group 1 and 3 (cf. negation of implicit vs explicit negative presupposition).

11 From macro to nano

	- a. embedded non-root contexts are ruled out (Restriction 1);
	- b. TAM combinations not involving a non-finite form are ruled out (Restriction 2);
- c. among restructuring verbs, *potere* 'can' is widely accepted, *volere* 'want' is more restricted and *smettere* 'stop' is largely ruled out (Restriction 3).

While all three restrictions apply to Group 3, Group 1 is only subject to Restriction 1.<sup>5</sup> By way of illustration, consider the examples below, showing that embedded non-root contexts like the ones selected by a matrix volitional verb are ruled out in both groups (4); that while both simple and compound tenses are admitted by Trentino speakers, only the latter are admitted by speakers of Group 3 (5); and that while Trentino allows *ben* to occur with *potere*/*volere*/*smettere*, only the first is completely grammatical in all the tested contexts in Group 3 (6):<sup>6</sup>

- a. \* Gianni vuole che Marco compri *ben* qualcosa per cena.
  G. want.prs.3sg that M. buy.sbjv.3sg ben something for dinner
  'Gianni wants Marco to buy something for dinner.'

- a. Group 1/\*3
  Gianni compra *ben* qualcosa per cena quando può.
  G. buy.prs.3sg ben something for dinner when can.prs.3sg
  'Gianni buys something for dinner when he can.'

<sup>5</sup>Note however that, for all speakers, *ben* can be used in root-like embedded clauses, like in embedded clauses introduced by a verbum dicendi.

<sup>6</sup> See further examples in Cognola & Schifano (2018a,b).

### Norma Schifano & Federica Cognola

- b. Group 1/3
  Gianni avrebbe *ben* comprato qualcosa per cena, se avesse potuto.
  G. have.cond.3sg ben bought something for dinner if have.sbjv.ipfv.3sg been.able
  'Gianni would have bought something for dinner if he had been able to.'

(6) Group 3 (Italian)


On the basis of the distributional facts summarised above, we classify Trentino as the productive isogloss for the use of *ben*. Conversely, Group 3 allows a considerably more restricted usage, and Group 2 represents a transitional area between the two, where the above restrictions do not yet apply consistently. One of the most striking results of this investigation is that the localities in Group 3 behaved surprisingly homogeneously, in spite of their geographical dispersion, suggesting that, once outside the productive isogloss, varieties conform to the same behaviour. In what follows, we shall not attempt to account for the above restrictions (see Cognola & Schifano 2018a,b for a proposal), but will instead consider a representative set of examples regarding the distribution of this particle in earlier attestations of Italo-Romance, in order to assess whether the more liberal pattern of Trentino instantiates an innovative or conservative stage in the distribution of *ben*.


# **3 Italian: Diachronic variation in morpho-syntactic requirements**

A preliminary examination of occurrences of *ben* in earlier attestations of Italo-Romance suggests that the extensive use found in Trentino reflects a conservative stage, in which *ben* occurred in a wider range of TAM contexts than in present-day (standard) Italian.<sup>7</sup> More specifically, we observe that (i) the particle was already employed to deny a negative presupposition, and (ii) although occurrences of *ben* in non-root embedded contexts do not seem to be attested (cf. Restriction 1),<sup>8</sup> the particle was not only allowed with compound tenses, such as the present perfect (7a) and pluperfect indicative (7b), as well as with restructuring verbs like *potere* 'can' (7c), but was also readily admitted with simple tenses, such as the present indicative (8), the imperfect indicative (9), the simple past (10) and the simple future (11), on a par with modern-day Trentino and unlike the contemporary Italian varieties of Group 3 (cf. Restriction 2):<sup>9</sup>

(7) a. (negative presupposition: the knight does not deserve to be treated in such an uncivil manner)

> Così tenendo lor camino, trovaro il re Meliadus ch'andava a uno torneamento, altressì a guisa di cavaliere errante e sue arme coverte. E' domandò questi sergenti: "Perché menate voi a 'mperatore questo cavaliere? E chi [è] elli, che cosìe lo disonorate villanamente?" Li sergenti rispuosero: "Elli hae *bene* morte servita; e se voi il sapeste come [noi], voi il menareste assai più tosto di noi. Adomandatelo di suo misfatto!" (*Novellino*, LXIII, p.267, l.20–28) 'Along the road they met King Meliadus, on his way to a tournament, also dressed as a knight errant and hiding his weapons. He asked the

<sup>7</sup>We take "standard" Italian to pattern with Group 3, as shown by the scores provided by our central-southern informants, whose judgements refer to their competence of the standard language, *ben* being absent both from their regional varieties of Italian and from their local Romance dialects. The diachronic data reported below are taken from two central-northern varieties only, namely Old Tuscan (cf. *Novellino*, late 13th century) and Old Venetan (cf. *Lio Mazor*, 14th century). We therefore do not exclude the possibility that other early varieties of Italo-Romance behave differently. The English translations provided for *Novellino* have been freely adapted from Consoli's (1997) edition.

<sup>8</sup>That *ben* should be excluded from non-root embedded contexts also in the early varieties under review here is not surprising under the analysis proposed in Cognola & Schifano (2018a), where *ben* is licensed by a negative operator in ForceP, as argued elsewhere for other discourse particles (see Coniglio 2008 and Zimmermann 2004; 2011, a.o.).

<sup>9</sup>We leave it open to future research to determine whether Restriction 3 was active or not in the early varieties under investigation here.

vassals: "Why are you carrying this knight to the emperor? And who is he, that you are dishonouring him in such an uncivil manner?" The vassals replied: "He well deserves to die; and if you knew why, you would be carrying him faster than us. Ask him yourself about his crime!"'

b. (context: there is a quarrel involving Lena's son and Pero Stomarin. Lena's son was supposed to give Pero Stomarin money for the fish, but according to Çanun he has kept it for himself. Negative presupposition: Lena's son has not given the money to Pero Stomarin)

[…] la quala dis che Çanun diseua che lo fio de Lena aueua toleto li deneri del pes da Siluester Uener et lo fio dis ch'el li aueua *ben* dati a Pero Stomarin. (*Lio Mazor*, p.48, l.160–163)

'[…] she said that Çanun said that Lena's son had taken the money for the fish from Siluester Uener and the son said that he had indeed given it to Pero Stomarin.'

c. (negative presupposition: the infant girl cannot be the doctor's daughter)

Uno medico di Tolosa tolse per mogliera una gentile donna di Tolosa, nepote dell'arcivescovo. Menolla. In due mesi fece una fanciulla. Il medico non ne mostrò nullo cruccio, anzi consolava la donna e mostravale ragioni secondo fisica, che *ben* poteva esser sua di ragione […]. (*Novellino*, XLIX, p.234, l.3–7)

'A doctor from Toulouse took for his wife a gentle woman of Toulouse, the niece of the Archbishop. He brought her home. Two months later she gave birth to an infant girl. The doctor showed no signs of anger, instead he consoled his lady and presented many reasons, according to the law of physics, which logically proved the child could be his.'

(8) (negative presupposition: your god is not better)

E tornando al signore per iscommiatarsi da lui, il signore disse: – Or sei tu ancor qui? Non avestu la torta? – Messer sì, ebbi. – Or che ne facesti? – Messere, io avea allora mangiato: diedila a un povero giullare che mi diceva male perch'io vi chiamava mio Iddio. – Allora disse il signore: – Va' con la mala ventura: ché *bene* è miglior il suo Iddio che 'l tuo! – E disseli il fatto della torta. (*Novellino*, LXXIX, p.309, l.30–39) '[the minstrel] returned to his lord to formally take his leave, and his lord said: – You're still here? Didn't you receive a tart? – Sire, I had it – What


did you do with it? – Sire, I had already eaten: I gave it to that poor minstrel who chided me for calling you my god. – Then the lord said: May misfortune follow you, for it is true that his god is better than yours! – And then he told him all about the tart.'

(9) (context: a watchman sees a boat in the sea which looks like Nasel's one. The watchman orders the man on the boat to dock, but the person refuses and gets a fine. The judge asks the watchman whether he knows the man on the boat and the watchman replies no. Negative presupposition: the boat was not Nasel's) Domandà s'el lo cogno[se]se, li dis, no; mo lo burclo era *ben* del Nasel. (*Lio Mazor*, p.43, l.11)

'[the judge] asks whether he knows [the man on the boat], he says "no"; but the boat was indeed Nasel's.'

(10) (negative presupposition: you didn't see them passing) Quell'altro cavalcò poi più volte, tanto che udì il padre e la madre fare romore nell'agio, e intese dalla fante com'ella n'era andata in cotal modo. Questi sbigottì: tornò a' compagni e disselo loro. E que' rispuosero: – *Ben* lo vedemmo passar con llei, ma nol conoscemmo: et è tanto, che puote bene essere allungato; et andarne per cotale strada.

(*Novellino*, XCIX, p.350, l.47–53)

'The other man rode past her house many times, until he heard her mother and father making a ruckus inside, and he learned from the maidservant what had taken place. He was mortified: he returned to his companions and told them all *[i.e. that the lady had left with another man, without being seen]*. They replied: – We did see him pass with her, but we didn't recognise him: and it was so long ago, they must be far away by now; this is the road they took.'

(11) (context: there is a quarrel involving Maria, Magdalena and Francesca. Maria wants to buy some bread from Magdalena. She takes a piece of bread, but another woman, Francesca, grabs it from her hands. There is a fight between the two women. Magdalena understands that Maria wants to steal the bread and Maria answers as below. Negative presupposition: you will not pay for the bread)

[…] no me-lo tor, ch'e' tel pagarò *ben*. (*Lio Mazor*, p.27, l.12) 'don't take it away from me, that I will indeed pay you for that'

The availability of *ben* across a wide selection of TAM contexts (vs. Restriction 2) exemplified in (8–11) points to a high degree of grammaticalization of this

### Norma Schifano & Federica Cognola

particle in earlier stages of Italo-Romance, a situation which today persists in Trentino, i.e. the productive isogloss, but not elsewhere.<sup>10</sup> Accordingly, we suggest that the distribution of *ben* in Trentino reflects a conservative pattern. The reason why Trentino has preserved an earlier stage of the language, unlike all the other varieties of the Italian peninsula under investigation here, may be linked to contact with German (in terms of reinforcement of a shared property, see Benincà 1994; Cordin 2011; Cognola 2014), which makes very productive use of discourse particles (see Cognola & Schifano 2018b for a parallel between Italian *ben* and German *doch*, and Weydt 1969, among many others, on German discourse particles). As for the other varieties, these show a reduced distribution of *ben* which, from a diachronic perspective, may be interpreted as an example of retraction (Norde 2011), i.e. it reflects the steps of a diachronic process whereby *ben* was originally allowed in all the contexts admitted in early Italo-Romance and still retained by Trentino. Our fine-grained diatopic investigation has shown that, despite their geographical scatter, all the speakers outside the productive isogloss are remarkably consistent in their judgements. We take this to indicate that the retraction of *ben* from early Italo-Romance to the present-day varieties outside the productive isogloss has followed the same path. This diatopic and diachronic path can be informally represented as in (12):<sup>11</sup>

(12) a. lexical verbs: simple tenses → compound tenses
     b. functional verbs (cf. restructuring): *smettere* 'stop' → *volere* 'want' → *potere* 'can'

The path in (12) reads as follows: among lexical verbs, *ben* is first lost with simple tenses; among restructuring verbs, it is first lost with *smettere* 'stop' and, partially, with *volere* 'want'. (12) can also be read as a synchronic implicational

<sup>10</sup>Here we are also glossing over the (apparently) distinct placements of *ben* in the examples (8–11), including its preverbal placement (cf. 7c, 8, 10), which have to be interpreted in the light of the distinct word order restrictions which were active in earlier varieties of Italo-Romance (see Ledgeway 2012 for an overview and references) and which are not immediately relevant for the purposes of the present discussion. We leave it open to further research to determine the exact position of *ben* in early Italo-Romance varieties and to establish whether the analysis offered by Cognola & Schifano (2018a,b) can capture this variation. We also note, in passing, that the full form *bene* too was allowed in its discourse particle meaning (7a), (8), unlike in present-day regional Italian.

<sup>11</sup>Note that the geographical factor is not totally irrelevant here, as localities located closer to Trentino (see varieties in the transitional Group 2, like Cortina d'Ampezzo) allow *ben* in a wider selection of contexts than other localities of the same group.


scale, i.e. if a variety admits *ben* with simple tenses, it will also admit it in all the other contexts, as shown by Trentino (and partly by the early varieties under investigation here, pending further research on a wider corpus).
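The implicational reading of (12) can be illustrated with a small consistency check. The sketch below is purely expository: the context labels and the variety data are hypothetical placeholders, not our survey results.

```python
# Illustrative consistency check for the implicational scale read off (12).
# Contexts are ordered from most retained (left) to first lost (right);
# the labels and variety data below are hypothetical placeholders.
SCALE = ["compound_tenses", "potere", "volere", "smettere", "simple_tenses"]

def respects_scale(admitted):
    """True iff the admitted contexts form a contiguous prefix of SCALE:
    admitting a more marked context implies admitting every less marked
    one, as (12) requires."""
    flags = [context in admitted for context in SCALE]
    if False in flags:
        first_gap = flags.index(False)
        # Once a context is rejected, no more marked one may be admitted.
        return not any(flags[first_gap:])
    return True

trentino = set(SCALE)                     # ben admitted everywhere
group3 = {"compound_tenses", "potere"}    # sharply restricted usage
violator = {"simple_tenses"}              # simple tenses only: impossible

assert respects_scale(trentino)
assert respects_scale(group3)
assert not respects_scale(violator)
```

Any variety conforming to the scale thus occupies a contiguous "prefix" of the path in (12), which is what the homogeneous judgements outside the productive isogloss suggest.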

## **4 Italian: Towards a parameter hierarchy**

In the remainder of this work, we would like to capture the implicational relationships described in (12) in terms of a parameter hierarchy. Following the latest advances of the *ReCoS* group (Roberts 2012; Biberauer & Roberts 2012; 2015; 2016; Biberauer, Holmberg, et al. 2014; Biberauer, Roberts & Sheehan 2014, a.o.), we adopt the taxonomy of parameter types outlined in (13) and schematized in Figure 11.1 (taken from Biberauer & Roberts 2012; 2016):

(13) a. *Macroparameters*: all functional heads of the relevant type share *v<sup>i</sup>*;
     b. *Mesoparameters*: all functional heads of a given naturally definable class, e.g. [+V], share *v<sup>i</sup>*;
     c. *Microparameters*: a small subclass of functional heads (e.g. modal auxiliaries) shows *v<sup>i</sup>*;
     d. *Nanoparameters*: one or more individual lexical items is/are specified for *v<sup>i</sup>*.

The central idea summarised in (13) and Figure 11.1 is that a macroparametric effect obtains when a given property holds of all relevant heads, and is therefore easily set by the learner and likely to be stable over millennia. As one moves down the hierarchy, the subset of heads characterised by the relevant property progressively shrinks, moving from a natural-class subset of heads (cf. mesoparameters), through a further restricted natural-class subset of heads (cf. microparameters), to a reduced set of lexically specified items (cf. nanoparameters), all increasingly less salient in the primary linguistic data (PLD) and consequently less resistant to reanalysis (Biberauer & Roberts 2016: 261).

Turning our attention again to the morpho-syntactic distribution of *ben* described above, which gradually decreases as one moves outside the productive isogloss of Trentino (cf. Group 3 and 1, respectively), passing through a grey area of variation (cf. Group 2), we immediately realise that this kind of diatopic variation remarkably reflects the path of specialization predicted by the above taxonomy. If we label the discourse function carried out by *ben* as "negative marking

Figure 11.1: General format of parameter hierarchies

of negative presupposition", as argued in Cognola & Schifano (2018a), we observe that in the early varieties discussed above (here cumulatively referred to as "early Italo-Romance") and in Trentino, such marking is allowed on all [+V] heads, that is, a natural-class subset of heads, corresponding to a mesoparametric option. Group 3, conversely, seems to split its behaviour. As far as its lexical verbs are concerned, these clearly instantiate a microparametric option, with *ben* being attested on a further restricted natural-class subset of heads, namely [+V] perfective heads (cf. Restriction 2). Its functional (viz. restructuring) verbs, by contrast, represent a nanoparametric choice, in that *ben* seems to be allowed only on lexically specified items (cf. *potere* vs. *smettere*).<sup>12</sup> The relevant portion of this hierarchy is sketched in Figure 11.2.

The fact that Group 3 simultaneously instantiates both a micro and nanoparametric option or, more precisely, that lexical vs. functional verbs are split in their behaviour, may be unexpected under the taxonomy in (13) and Figure 11.1, but finds a plausible explanation if we consider the diachrony. As discussed in §3, the

<sup>12</sup>The restrictions on the occurrence of *ben* with restructuring verbs do not seem to be amenable to any explanation other than the nanoparametric classification proposed here. Indeed, the position of the tested restructuring verbs in Cinque's (2006) hierarchy, which could plausibly play a role, does not seem to be relevant: *volere*, which is the highest, is less accepted than *potere*, which is the lowest, while *smettere*, which lexicalises a position between the two, is largely ruled out (see also Cognola & Schifano 2018b).



Figure 11.2: The distribution of *ben*

distribution of *ben* in Group 3 is likely to represent a reduction of a previously much more extended usage, i.e. it is an instance of diatopic variation which reflects a diachronic path. A closer look at the data presented in Cognola & Schifano (2018a,b) suggests that such a retraction may still be ongoing.<sup>13</sup> Under this hypothesis, the behaviour of Group 3 and the representation in Figure 11.2 are no longer surprising. That the lower branches represent unstable options is indeed consistent with current assumptions on diachronic change within the parametric hierarchy approach, where micro- and nanoparametric options are taken to be highly unstable (Biberauer & Roberts 2016: 261).

<sup>13</sup>For example, our investigation with native speakers has shown that there is a tendency for modally-marked compound tenses (e.g. the conditional perfect) to score better than the temporally-related ones (e.g. the present perfect). Similarly, if a compound form allows both a temporal and a modal reading (e.g. the future perfect with a temporal vs. an epistemic interpretation), the latter is usually preferred.

This further and rather subtle specialization with compound tenses (i.e. the only morphosyntactic combination allowed with lexical verbs), not necessarily shared by all speakers yet, may indicate that the retraction of *ben* in Group 3 is still under way. See Cognola & Schifano (2015) for data showing a similar tendency with *smettere* (i.e. largely ruled out, but modally-marked interpretations receive higher scores).


## **5 Conclusions**

In the present squib we have discussed the distribution of the discourse particle *ben* in a selection of regional varieties of Italian, as described in Cognola & Schifano (2015; 2018a,b). On the basis of the judgements expressed by native speakers, we have identified three main morpho-syntactic restrictions which affect the distribution of *ben* in Group 3 but not in Group 1, which we take to be the productive isogloss. A preliminary examination of diachronic evidence has also suggested that the more liberal use of Trentino reflects an earlier stage of Italo-Romance, where *ben* was also allowed in a wide array of TAM contexts. In conclusion, we have suggested that the attested diatopic variation can be successfully formalised in terms of a parameter hierarchy, in that the gradual retraction of the admitted contexts we described finds a remarkable parallel with the macro > meso > micro > nano path independently argued for by the parameter hierarchy approach on the basis of extensive diachronic and typological evidence. This also allows us to provide new insights into the diachronic development of *ben* from early Italo-Romance to the present-day varieties. The advantage of modelling the (shrinking) diatopic and diachronic distribution of *ben* via a parameter hierarchy is that it allows us to formally capture a type of variation which would otherwise look like random change (see for example the *potere* vs. *volere* restriction, here captured as a nanoparametric option). The case of Italian *ben* also opens the way to future research on the possibility that the (morpho-syntactic) behaviour of elements at the syntax–discourse interface is also subject to the predictions of the parameter hierarchy approach.

## **Abbreviations**


## **Acknowledgements**

We would like to dedicate this squib to Ian Roberts, whose work and outstanding scholarship have greatly inspired our own investigations. We hope that it is


successful in showing how his research, including that conducted with the *ReCoS* team, is opening the way to thrilling new lines of investigation.

Although this entire work stems from joint research, for the administrative purposes of Italian academia Norma Schifano takes responsibility for §1, §2 and §4, and Federica Cognola for §3 and §5.

## **Sources**

*I monumenti del dialetto di Lio Mazor*. 1904. Ugo Levi (ed.). Venice: Visentini.

*Il Novellino*. 1970. Guido Favati (ed.), *Testo critico, introduzione e note*. Genoa: Fratelli Bozzi. (English translations adapted from Consoli 1997.)

## **References**


Benincà, Paola. 1994. *La variazione sintattica*. Bologna: Il Mulino.







# **Part II**

# **Syntactic interfaces**

# **Chapter 12**

# **In search of prosodic domains in Lusoga**

## Larry M. Hyman

University of California, Berkeley

In this paper I raise the question of whether Lusoga, a Bantu language of Uganda, recognizes syntactically determined prosodic domains, which have been extensively described in near-mutually intelligible Luganda. I first briefly recapitulate the syntactic constructions that give rise to the tone group (TG) and tone phrase (TP) domains in Luganda and then consider the same constructions in Lusoga. Whereas the expectation is that pre-verbal constituents will be treated prosodically differently from post-verbal constituents in SVO Bantu languages, Lusoga treats both pre- and post-verbal constituents the same, including both left- and right-dislocations. While certain clitics do form a TG with the preceding word, perhaps forming a recursive phonological word, there is nothing corresponding to the multiword TG or TP of Luganda. Either Lusoga fails to distinguish phonological phrases or, if they do exist in the language (as universally claimed), it fails to mark them. I conclude that linguistic typology should not only determine how universal linguistic properties can be reflected in the grammar of a language, but also how well a grammar can get along without signaling them at all.

> "… the very types of prosodic category above the foot and syllable are syntactically grounded and universal." (Selkirk & Lee 2015: 3)

> "… the prosodic phonology of Luganda is among the most intricate and complex of any language." (Hyman & Katamba 2010: 69)

Larry M. Hyman. 2020. In search of prosodic domains in Lusoga. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 253–276. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972850


## **1 Introduction**

The purpose of this paper is to raise the question of whether the phrasal tonology of Lusoga (Bantu; Uganda), the language most closely related to Luganda, is syntactically grounded, or is free to apply without respect to syntax. Outside of Bantu, cases have been reported where phrasal or post-lexical tonology applies whenever two words meet within a clause, independently of the syntax, and hence without the need for prosodic domains. These include the VSO Chatino languages of Mexico (Cruz 2011; Campbell 2014; McIntosh 2015; Sullivant 2015; Villard 2015) and the SOV language Kuki-Thaadow (Kuki-Chin; NE India, Myanmar) (Hyman 2010). In such languages the relevant tonal alternations between words are blocked only by pause or "sentence breaks".

The story is considerably different in the Bantu languages. Although there is considerable variation, the expectation is that there will be extensive interaction between the syntax and the prosodic phonology, specifically interaction of syntactic constituency and/or information structure (focus) with tone and/or penultimate lengthening. In particular, we expect the SVO syntax to be prosodically reflected by an asymmetry between what precedes vs. what follows the verb. Thus, in a number of works on Luganda, e.g. Hyman et al. (1987) and Hyman & Katamba (2010), we have recognized the following postlexical domains within which tone rules act on the lexical stem and word tones:<sup>1</sup>

(1) a. a smaller tone group (TG), within which H tone plateauing (HTP) occurs
    b. a larger tone phrase (TP), within which H tone anticipation (HTA) occurs

One question is whether this sensitivity to syntax can be attributed, perhaps universally, to the SVO syntax of Luganda (and other Bantu languages), or whether the prosodic phonology of an SVO language can also apply across the board, without any sensitivity to syntactic structure.

As I will show below, despite its near-mutual intelligibility with Luganda, Lusoga provides no evidence of prosodic domains above the phonological word. In what follows I will first briefly identify the above Luganda domains, then consider the corresponding structures in Lusoga, which show no empirical evidence for either prosodic domain. I will then discuss what Lusoga does have and what this might mean for syntax–phonology interactions and the quest for universals.

<sup>1</sup>We also recognize an intersecting clitic group (CG), which pertains mostly to vowel length alternations.


## **2 Prosodic domains in Luganda**

The analysis of Luganda tone is given in (2), as summarized by Hyman & Katamba (2010: 70):


As indicated, moras are either marked by an underlying privative /H/ or are toneless (∅). Within the lexical (word-level) phonology, L tones arise in one of two ways, illustrated in (3).

(3) a. /ba-lab-a/ (H-H) → bá-làb-a 'they see'
    b. /ba-bal-a/ (H) → bá-bàl-a 'they count'

In (3a) Meeussen's rule converts a sequence of Hs on successive moras to one H followed by all Ls. A sequence /H-H-H-H/ would thus become H-L-L-L. In (3b) L tone insertion applies after a lone H which would not be subject to Meeussen's rule. The result is an intermediate ternary contrast between H, L, and ∅. Finally, after the phrasal phonology applies, the ∅s are all filled in with either H or L, thereby bringing the system back to a binary contrast, this time equipollent.<sup>2</sup>
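The two word-level rules just described can be sketched informally as operations on a list of mora tones. The encoding below (strings `"H"`/`"L"`, `None` for toneless ∅) is a toy simplification for illustration only, not the formalism of the analysis.

```python
# Toy sketch of the two word-level rules in (3): "H" is the privative
# high tone, "L" a derived low, None a toneless mora (the system's 0).
def meeussen(tones):
    """Meeussen's rule: in a run of successive H moras, keep the first
    H and turn the rest into L, so /H-H-H-H/ becomes H-L-L-L."""
    out, prev_was_h = [], False
    for t in tones:
        out.append("L" if (t == "H" and prev_was_h) else t)
        prev_was_h = (t == "H")
    return out

def l_insertion(tones):
    """L tone insertion: a toneless mora directly after a lone H
    (one untouched by Meeussen's rule) surfaces as L."""
    out = list(tones)
    for i in range(1, len(out)):
        if out[i] is None and out[i - 1] == "H":
            out[i] = "L"
    return out

# (3a) /ba-lab-a/ with H-H: Meeussen's rule yields H-L.
assert l_insertion(meeussen(["H", "H", None])) == ["H", "L", None]
# (3b) /ba-bal-a/ with a lone H: L insertion yields H-L.
assert l_insertion(meeussen(["H", None, None])) == ["H", "L", None]
# A longer run: /H-H-H-H/ -> H-L-L-L.
assert meeussen(["H", "H", "H", "H"]) == ["H", "L", "L", "L"]
```

The remaining `None` moras stand in for the ∅s that are only filled in with H or L after the phrasal phonology.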

### **2.1 The TP**

We are now ready to consider the two prosodic domains mentioned in (1). As illustrated in (4), within the TP, H tone is anticipated across words onto any number of preceding toneless moras, indicated here and in subsequent examples by underlining:<sup>3</sup>

(4) a. verb + object a-bal-a e-bi-kópò → à-bál-á é-bí-kópò 's/he is counting cups' H L %L H L

<sup>2</sup>There also is a marginal downstepped ꜜH which arises when two phonological phrases meet, the first ending in a HL falling tone, the second beginning with H.

<sup>3</sup> In (4) and subsequent examples %L marks an initial boundary tone which will be crucial to establishing the tone phrases in Luganda. In §3 we will see that this %L is restricted to postpause position in Lusoga.


> b. object + object a-bal-ir-a o-mu-limi e-bi-kópò → à-bál-ír-á ó-mú-límí é-bí-kópò 3sg-count-appl-fv 's/he is counting cups for the farmer' H L %L H L

The example in (4a) shows HTA applying from the direct object onto the verb, while (4b) shows HTA from the second object through the first object and, again, onto the verb (which is marked by the applicative *-ir-* suffix). In (5) we see that HTA also applies between a right-dislocated element (RD) and the verb and between RDs, again onto the verb:<sup>4</sup>

(5) a. verb + RD a-bi-bal-a e-bi-kópò → à-bí-bál-á é-bí-kópò s/he-them-count 's/he is counting them, the cups' H L %L H L b. RD + RD a-bí-mù-bal-ir-a o-mu-limi e-bi-kópò → à-bí-mù-bál-ír-á ó-mú-límí é-bí-kópò s/he-them-him-count-appl-fv H L %L H L

's/he is counting them for him, the farmer, the cups'

HTA does not, however, apply from the verb onto a constituent that precedes, whether the subject, an adverb, or a left dislocation (LD):<sup>5</sup>

(6) a. subj + verb o-mu-limi a-bi-láb-à → ò-mù-lìmì à-bì-láb-à 'the farmer sees them' H L %L H L b. LD + LD o-mu-limi e-bi-kópò a-bi-láb-à → ò-mù-lìmì è-bì-kópò à-bì-láb-à H L H %L L L %L H L 'the farmer, the cups, he sees them'

As indicated by the dashed underlining, (6a) shows that HTA does not apply from the verb onto the subject *ò-mù-lìmì*, which instead receives default L tones. Nor is

<sup>4</sup>Here and elsewhere it is important to note that, without exception, when two vowels meet across a word boundary, they coalesce with deletion or gliding of the first vowel and compensatory lengthening of the second. Thus, (5a) is pronounced [à-bí-bál éé-bí-kópò]. Thus, to answer one reviewer, there is no pause between a right dislocation and what precedes. For more on the phonological processes involved, see Clements (1986), Hyman & Katamba (1999), and references cited therein.

<sup>5</sup>Below in (9a) I will suggest that each such constituent is marked by an initial %L boundary tone which is responsible for blocking HTA.


there HTA from one LD onto another in (6b).<sup>6</sup> Instead, LDs and other pre-verbal constituents are marked off in a way that post-verbal constituents including RDs are not.<sup>7</sup>

Before accounting for this fact let us consider the opposite marking of dislocations in closely related Haya (Byarushengo et al. 1976: 201–202; Hyman & Katamba 1999: 155). In this language a /H-∅/ sequence is realized [HL-L] at the end of a tone phrase, e.g. in isolation:

(7) a. a-ba-kázi → à-bà-kâzì 'woman'
    b. e-m-búzi → è-m-bûzì 'goat(s)'

Noting this, we now see in (8) that Haya presents a near mirror-image of Luganda (we can ignore the "augment" initial vowel H on the nouns):


The base sentence is given in (8a). In (8b) we see that the /H/ of LDs is not affected, while in (8c), the /H/ of the verb and each RD becomes HL. RDs are thus each marked off, while LDs are not. The two languages are thus analyzed with the reverse nested structures in (9) (Byarushengo et al. 1976: 84; Hyman & Katamba 2010).

<sup>6</sup>Note in (6b) that HTA does not apply between *e-bi-kópò* 'cups' and *a-bi-láb-à* 'he sees them' because the former ends in a L tone. For HTA to apply, the preceding word must end with a toneless vowel.

<sup>7</sup>Again, not shown is the V#V coalescence that automatically applies between any words in sequence, including LDs and RDs, but does not affect the tonal discussion.


(9) a. Luganda marks beginnings of complete Us

b. Haya marks ends of complete Us

In (9) I have labeled each complete syntactic utterance with U. Luganda thus marks the beginning of each U with a %L boundary tone, while Haya marks the end of each U with a final L% boundary tone, one of whose effects is to convert a penultimate H into HL. As Byarushengo et al. (1976) point out, each L% correlates with the end of a complete assertion.
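The interaction of HTA with the %L boundary tone can likewise be sketched informally. The encoding below, in which %L occupies a mora slot and is realized as L by default fill-in, is an illustrative simplification of my own, not the analysis argued for here.

```python
# Toy sketch of H tone anticipation (HTA) and the %L boundary tone:
# a tier is a list of mora slots holding "H", "L", None (toneless),
# or the boundary marker "%L".
def hta(tier):
    """Spread H leftward onto any number of preceding toneless moras;
    since %L occupies a slot, it halts the spreading."""
    out = list(tier)
    for i in range(len(out) - 2, -1, -1):
        if out[i] is None and out[i + 1] == "H":
            out[i] = "H"
    return out

def fill_default(tier):
    """After the phrasal phonology, remaining toneless moras (and, in
    this toy encoding, the %L slot itself) are realized as L."""
    return ["L" if t in (None, "%L") else t for t in tier]

# Schematic (4a): %L + five toneless moras + H-L, as in
# a-bal-a e-bi-kópò -> à-bál-á é-bí-kópò.
tp = ["%L", None, None, None, None, None, "H", "L"]
assert fill_default(hta(tp)) == ["L", "H", "H", "H", "H", "H", "H", "L"]

# A second %L (e.g. at the left edge of a pre-verbal constituent)
# blocks HTA from reaching the material before it.
blocked = ["%L", None, None, "%L", None, "H", "L"]
assert fill_default(hta(blocked)) == ["L", "L", "L", "L", "H", "H", "L"]
```

The second case mirrors the blocking effect attributed above to the %L at the left edge of subjects and left-dislocations.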

Before moving on to the tone group, it should perhaps be pointed out that if the TP correlates with the phonological (or even intonational) phrase of prosodic domain theory, we don't expect to find a TP break within a simple noun phrase. While this is largely the case, there is a problem with numerals in Luganda:

(10) a. noun + adjective: a-ba-limi a-ba-nénè → à-bá-límí á-bá-nénè 'big farmers'
     b. noun + numeral: a-ba-limi ba-sátù → à-bà-lìmì bà-sátù 'three farmers'


As expected, HTA applies in (10a) from an adjective onto a preceding noun. However, HTA does not apply in (10b) from the numeral onto the noun. It is as if the noun is in a separate TP, as in the case of a preverbal constituent. I don't see any reason to think of numerals as predicative, such that 'farmers' would be preposed to the numeral (as a subject is to the verb marked by %L). While it is hard to motivate syntactically, the apparent need is for there to be an analogous %L separating the numeral from the preceding noun. This being said, Bantu languages that allow a subset of modifiers to be either pre- or post-nominal, e.g. demonstratives (van de Velde 2005), may also not phrase them with the head noun.

### **2.2 The TG**

The TG is a smaller domain in which the head V or N of the corresponding XP undergoes reduction when followed by an appropriate dependent with H tone. In Haya, the V or N undergoes deletion of its one or more H tones, while in Luganda, the V or N loses the L(s) of a H to L pitch drop, as the result of a process of H tone plateauing (HTP). For this to occur several conditions must be met, as schematized in (11) (Hyman & Katamba 2010: 75):

(11) XP = [ X YP ], Z = a phonological word within YP, where:
     (i) X ≠ [+focus]
     (ii) Z ≠ [+augment]

In (11), Z stands for a phonological word (PW) which is not necessarily the head of YP (as when there is an empty head, e.g. 'we saw two'). The [±focus] feature refers to whether a verb's tense-aspect-mood (TAM)/polarity marking is inherently focused. The following pair of examples shows that negation is inherently [+focus] (cf. Hyman & Watters 1984):

(12) a. tw-áà-làb-à → tw-áá-láb-á bí-kópò 'we saw cups' (Past₂)
     b. te-tw-áà-làb-à → tè-tw-áà-làb-à bì-kópò 'we didn't see cups' (Past₂)

In (12a) the Hs of the verb and object create an all-H plateau, requiring the Ls of the verb to be deleted (indicated by ∅). (As glossed, focus is on *bí-kópò* 'cups', marked by the absence of the augment *e-.*) However, H tone plateauing (HTP)


does not apply in (12b), where the only grammatical difference is the negative marking on the verb.<sup>8</sup>

The [±augment] feature refers to whether a noun has an augment, usually an initial *e-, o-* or *a-*. As seen in (13a), HTP will not apply if the augment is present. (13b) shows that the augment is obligatorily absent after a negative verb (without any focus effect), as it was in (12b) above.

(13) a. tw-áá-làb-à → tw-áá-làb-à è-bì-kópò 'we saw cups' (Past₂)
        H L            H L
     b. te-tw-áá-làb-à → \*tè-tw-áá-làb-à è-bì-kópò 'we didn't see cups' (Past₂)
           H L                H L

Within the verb phrase the YP can be anything as long as it isn't [+augment] (or an RD). This includes an object NP, prepositional phrase, adverb, etc. Within the noun phrase, plateauing occurs only in (some) compounding (Hyman & Katamba 2005) and before a possessive/genitive NP. In (14) we see that HTP does not apply between a noun and following adjective (possibly because adjectives are not YPs):

(14) a. N + A  e-bi-kópò → e-bi-kópò è-bì-nénè 'big cups'
                 H L         H L      H L
     b.        bi-kópò → tè-tw-áá-làb-à bì-kópò bi-nénè 'we didn't see big cups'
                 H L

While this could also be attributed to the augment on *è-bì-nénè* 'big' in (14a), the non-plateauing in the absence of the augment after the negative verb in (14b) unambiguously shows that N+A fails to become a TG. The examples in (15) show that a possessive pronoun and genitive noun will form a TG with the preceding head noun:


<sup>8</sup>HTA also does not apply since it must cross a word boundary, but it cannot do so when the preceding word ends L (vs. ∅).

12 In search of prosodic domains in Lusoga

In (15a) the final L of 'cups' is deleted as a result of plateauing with the HL of *by-ê* 'his/her'. The same occurs in (15b), where there is plateauing with the HL of the proper noun, pronounced *Kàtààmbâ* in isolation.

It is important to note that the TG is a relation between the head and one word (Z) to its right. That is, the full YP in (11) does not join the head X to form the TG. This is illustrated in (16).

(16) a. tw-áá-làb-à → tw-áá-láb-á bí-kópò bi-nénè 'we saw big cups' (Past₂)
        H L L          H ∅ ∅ H L           H-L
     b. tw-áá-làb-à → tw-áá-làb-à bì-tábó bí-nénè 'we saw big books' (Past₂)
        H L L          H L L               H-L

In (16a) there is plateauing between the verb and 'cups', which maintains its H-L pitch drop before the H-L of the adjective 'big'. In (16b) the verb joins with *bi-tabo* 'books', but since the latter is underlyingly toneless there is no possibility of H tone plateauing. Crucially, the verb cannot "see" the H of the adjective 'big'. The Hs that are observed on *bì-tábó* result from HTA within the larger TP domain.

However, there are cases where a H tone plateau can encompass several words. The following examples show that HTP can affect sequences of Head-Dependent words without respect to bracketing (Hyman 1988: 159):


The more common right-branching structure is observed in (17a). In this case N<sup>2</sup>+N<sup>3</sup> form a constituent which then joins N<sup>1</sup>. In the less common left-branching structure in (17b), N<sup>1</sup>+N<sup>2</sup> first form a constituent, which then joins N<sup>3</sup>. Although a single, three-word TG is formed, HTP does not apply to the whole constituent all at once. This is seen from the fact that an intervening toneless phonological word blocks HTP (Hyman 1988: 157). In the following examples, underlined Hs are from the application of HTA:

(18) a. e-bi-kópò by-àà mù-túúndá + bí-kópò  [ N<sup>1</sup> [ N<sup>2</sup> N<sup>3</sup> ] ]
           H L          H L
        'cups of the cup-seller' (literally, seller-cups)


     b. mu-kúbà + bà-límí w-áá Kátáámbâ  [ [ N<sup>1</sup> N<sup>2</sup> ] N<sup>3</sup> ]
        H L                    HL
        'farmer-beater of Katamba' (literally, beater-farmers)

Even though the same right- and left-branching complex TGs are formed, HTP must progress on a word-by-word basis. For this reason I proposed that HTP be a domain-juncture rule of the following form (Hyman 1988: 158):

(19) L<sup>n</sup> → ∅ / <sub>TG</sub>[ … <sub>PW</sub>[ … H \_\_ ] [ H … ]<sub>PW</sub> … ]<sub>TG</sub>

The rule is presented as L tone deletion followed by the fusion of the flanking H tones; the conception is that HTP occurs between PWs which are grouped together within a TG.<sup>9</sup>
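To make the intended effect concrete, the rule can be sketched procedurally. The following is my own illustrative Python toy (the function name and the flat-list representation are invented, and the word-by-word, domain-juncture character of (19) is abstracted away): a TG is flattened to one list of mora tones, and every mora between the first and last H is made H, i.e. the intervening Ls delete and the flanking Hs fuse.

```python
# Hedged sketch of H tone plateauing (HTP), cf. rule (19).
# Tones are 'H', 'L', or None (toneless); the TG is one flat list.

def h_tone_plateau(moras):
    """Delete the Ls standing between two Hs within a TG and fuse the
    Hs, so that every mora between the first and last H surfaces H."""
    out = list(moras)
    hs = [i for i, t in enumerate(moras) if t == 'H']
    if len(hs) >= 2:  # plateauing needs two Hs to fuse
        for i in range(hs[0], hs[-1] + 1):
            out[i] = 'H'
    return out

# Schematic verb + object, cf. (12a): H L L + L H -> all-H plateau.
print(h_tone_plateau(['H', 'L', 'L', 'L', 'H']))
# -> ['H', 'H', 'H', 'H', 'H']
```

A single H, as in a [+focus] verb standing alone, is left untouched, which mirrors the fact that plateauing is inherently a juncture phenomenon.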

In summary, the above and other Luganda facts potentially bear on multiple issues concerning prosodic domain theory vs. direct reference to syntax, the nature and number of prosodic domains (TP, TG, and ultimately the CG), the potential interaction between domains (domain juncture effects, nesting), and the interaction of prosodic domains with information structure (focus). With all of this hyper-activity in Luganda, we now turn to consider the equivalent structures in closely related Lusoga.

# **3 Prosodic domains in Lusoga (?)**

In Lusoga the most striking property is a historical process of H tone retraction (HTR) onto the preceding mora. In the following examples %L is an initial boundary tone, and H% is the declarative phrase-final boundary tone (which also occurs, but is variable in Luganda):


The infinitive in (20a) is lexically toneless, realized L-H-H-H-H by mapping %L to the first mora, and H% to the remaining moras. The Luganda realization is either the same, or all L if the variable H% is not chosen. In contrast, the verb root has an underlying tone in (20b). In this case the Luganda form is more straightforward:

<sup>9</sup>A perhaps equivalent alternative is that TGs are nested.


The verb base *-wúlir-* 'hear' has an underlying /H/ on its first mora, which as seen earlier in (3b) then conditions L tone insertion on the second mora. The remaining toneless moras receive L tone, unless H% is realized, in which case the output is *ò-kù-wúlìr-á*, with a final H. In Lusoga, instead, the H is realized on the preceding infinitive prefix *-kú-*, followed by two L tone moras. The H tone of the verb root has clearly shifted onto the preceding mora. The historical derivation is presented in (21).

(21) *stage 1* > *stage 2* > *stage 3* > *stage 4*
     o-ku-ghúlir-a > o-ku-ghúlìr-a > o-kú-ghùlìr-a > ò-kú-ghùlìr-á 'to hear'
     o-ku-kálakat-a > o-ku-kálàkat-a > o-kú-kàlàkat-a > ò-kú-kàlàkát-á 'to scrape'

At stage 1 we start with a H tone on the first mora of the verb base. Stage 2 represents the L tone insertion rule that was discussed with regard to Luganda, but which characterizes both languages. Stage 3 is where H tone retraction (HTR) applies in Lusoga only. As seen, I have indicated a L tone phonological "trace" on the original root-initial H tone mora in stage 3.

While (21) is historically correct, the proposed synchronic analysis is that \*H is now /L/. In other words, the Lusoga tone contrast has become /L/ vs. ∅ (Hyman 2018):

(22) a. o-ku-ghùlir-a 'to hear'
             L
     b. o-ku-kàlakat-a 'to scrape'
             L

Two rules are needed to derive the correct outputs. The first is L tone spreading (LTS): an input L spreads one mora to the right:

(23) a. o-ku-ghùlìr-a 'to hear'
             \ /
              L
     b. o-ku-kàlàkat-a 'to scrape'
             \ /
              L

The second rule is H tone insertion (HTI): a H is inserted on a mora that precedes an input L:


(24) a. o-kú-ghùlìr-a 'to hear'
          H  L
     b. o-kú-kàlàkat-a 'to scrape'
          H  L

As seen in (25) HTI has to be specified to insert a single H before a sequence of L morphemes (which we can assume to fuse into a single, multilinked L):

(25) o-ku-ci-mu-tu-gha-er-a → ò-kú-cì-mù-tù-ghè-èr-á 'to give it to him for us'
        L  L  L  L            %L H L             H%
     aug-inf-it-him-us-give-appl-fv
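The two rules can be sketched as simple functions over a word's mora-tone list. This is an illustrative Python toy of my own (the names and representation are invented): `'H'`, `'L'`, and `None` stand for H, L, and toneless moras; the cross-word application of HTI seen later in (32b) is omitted.

```python
# Hedged sketch of Lusoga L tone spreading (LTS) and H tone insertion
# (HTI), operating on one word represented as a list of mora tones.

def l_tone_spread(moras):
    """LTS (23): an input /L/ spreads one mora to the right."""
    out = list(moras)
    for i in range(len(moras) - 1):
        if moras[i] == 'L' and moras[i + 1] is None:
            out[i + 1] = 'L'
    return out

def h_tone_insert(moras):
    """HTI: a single H goes on the toneless mora preceding an input /L/,
    i.e. only before the first L of a run (cf. the fused Ls in (25))."""
    out = list(moras)
    for i in range(1, len(moras)):
        first_of_run = moras[i] == 'L' and moras[i - 1] != 'L'
        if first_of_run and out[i - 1] is None:
            out[i - 1] = 'H'
    return out

# /o-ku-ghùlir-a/ 'to hear', /L/ on the third mora, cf. (22a)-(23a):
tones = [None, None, 'L', None, None]
print(h_tone_insert(l_tone_spread(tones)))
# -> [None, 'H', 'L', 'L', None], i.e. o-kú-ghùlìr-a before boundary tones
```

Applying LTS before HTI matters: the spread L is no longer run-initial, so only one H is inserted, matching the single H of stage 3 in (21).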

With this established, we now have two relevant criteria to test for postlexical domains in Lusoga:


To anticipate the demonstration, the conclusion we will reach is that syntactic constituency never blocks HTI or HTA, thereby raising two competing hypotheses:

(26) Hypothesis 1: Lusoga does not have the prosodic domains found in Luganda.
     Hypothesis 2: Lusoga has prosodic domains, but does not mark them the same as Luganda.

The significance of the first is that the mapping of syntactic structures into prosodic domains would not be universal in the sense of Selkirk & Lee's claim in the quote at the beginning of this paper. The problem with the second is that there is no empirical evidence to justify the prosodic domains. To see this we need to consider the Lusoga facts which correspond to Luganda's TP and TG. We first consider HTA, then HTI.

### **3.1 H tone anticipation (HTA)**

Unlike in Luganda, the final H% boundary tone can reach the subject (as well as left-dislocations):

(27) a. Luganda  ò-mù-lìmì [ à-làgír-á 'the farmer commands'
                 %L          %L     H%
     b. Lusoga   ò-mù-lìmí [ à-làgír-á (idem)
                 %L                 H%


Similarly, unlike Luganda, HTA can spread a lexical or inserted H tone onto the subject:

(28) a. Luganda  ò-mù-lìmì [ à-bál-á é-mí-sótà 'the farmer counts snakes'
                 %L          %L H L        H%
     b. Lusoga   ò-mú-límí [ á-bál-á é-mí-sòtá (idem)
                 %L              H L        H%

The following examples show that H% and HTA can also reach left-dislocations:

(29) o-mu-limi e-bi-tabo a-bi-bal-a → ò-mù-lìmí é-bì-tàbó á-bì-bàl-á
     'the farmer counts the books'

Spreading of H% and HTA can also start from a right-dislocated element:

(30) a. a-bi-bal-a o-mu-limi e-bi-tabo → à-bí-bál-á ó-mú-límí é-bí-tábó
        %L H%
        'he counts them, the farmer, the books'
     b. a-bi-bal-a o-mu-limi e-bi-kopo → à-bí-bál-á ó-mú-límí é-bí-kópò
        L %L H L
        'he counts them, the farmer, the cups'

As in Luganda, HTA will apply only if the preceding word ends in at least one toneless mora, as in (31a). It will not apply if the preceding word ends in L, as in (31b).

(31) a. o-ku-ghùlir-a e-mi-sòta → ò-kú-ghùlìr-á è-mí-sòtà 'to hear snakes'
             L           L
     b. o-ku-bòn-a e-mi-sòta → ò-kú-bòn-à è-mí-sòtà 'to see snakes'
            L          L

From the above we can safely assume that HTA will apply no matter what the syntactic configuration. As stated in §1, this is quite surprising, given that almost all Bantu languages treat pre-verbal constituents differently from postverbal ones. In the next section we will see that HTI leads to the same conclusion.

### **3.2 H tone insertion (HTI)**

In this section it will be briefly demonstrated that HTI can also apply across any syntactic boundary. Because nouns have a prefix which is underlyingly toneless, this will have to be demonstrated by means of other word classes, e.g. verbs and demonstratives. Consider first (32a), where the subject prefix *a-* is underlyingly toneless:

(32) a. o-mu-kàzi a-sek-a → ò-mú-kàzì à-sék-á 'the woman laughs'
             L              %L H L        H%
     b. a-ba-kàzi bà-sek-a → à-bá-kàzí bà-sèk-á 'the women laugh'
             L    L          %L H L H L      H%

In this case the subject noun 'woman' ends with a L tone by virtue of the L tone spreading (LTS) rule. Therefore, the final H% cannot spread onto the subject noun. Compare this now with (32b), where the subject prefix /bà-/ has an underlying /L/. In this case HTI overrides LTS onto the final mora of the subject noun. In historical terms, the \*H of *\*bá-* has been anticipated from the verb onto the subject (cf. Luganda *à-bà-kázì bá-sèk-á*). The same facts are seen with left dislocations:


In (33a), H% does not reach the left-dislocated noun /e-bi-bàla/ 'fruits', since its /L/ spreads onto the final mora. In (33b), however, where the subject prefix /bà-/ has /L/ tone, HTI applies, and the H links to the final mora of the left-dislocated noun. In fact, HTI will apply across any sequence of words, provided that the preceding word does not end in a single /L/. This is illustrated in (34).



The proximate demonstrative /-no/ 'this, these' requires a L tone noun class agreement prefix, here /bì-/. As seen in (34a), the prefix conditions HTI on the final mora of 'fruits'. In (34b), on the other hand, the noun 'cups' ends in a single /L/ and hence HTI is blocked.

We thus arrive at the conclusion that syntactic constituency never blocks HTI or HTA. Returning to the two hypotheses in (26), we must address whether Lusoga recognizes prosodic domains at all – or whether it simply fails to give evidence of the syntax-to-prosodic domain mapping that Selkirk's (2011) matching theory predicts. Favoring universality, let's tentatively entertain the latter theory-driven position, Hypothesis 2 in (26): Lusoga has prosodic domains, but does not mark them. As was seen in §2, Luganda marks TPs with an initial %L, which can be taken to block HTA from the verb or between sentential preverbal constituents, each one of which begins a TP with its own %L. As Lisa Selkirk puts it (email of March 18, 2016):

In Lusoga, if HTA can extend from verb to subject and so on, it must be that there is no such L at the left edge of TP/ip. In other words a "domainless" HTA can spread its way leftward in Lusoga without a problem, but it would be blocked by the boundary L in Luganda.

Under this interpretation Lusoga would not have %L internal to the intonational phrase (IP), at most an IP-initial %L to predict the realization of post-pause toneless words such as *ò-kú-lágír-á* 'to command' in (20a). Such words require an initial L to precede the multiple Hs from H%. This could be either the effect of an IP-initial %L tone or perhaps some kind of constraint against initial H.

### **3.3 The TG**

In §2 we saw that Luganda distinguishes two prosodic domains, the TP and the TG. The preceding discussions of HTA and HTI both addressed the TP. In this section we show that Lusoga provides evidence for the TG only at the phonological word (PW) level. Importantly, there is no "phrasal" TG in Lusoga, i.e. no case of a head (X) + phonological word (Z) producing H tone plateauing (HTP). The examples in (35) show that the configurations that were seen to produce HTP in Luganda in (4a) and (15b) above fail to produce HTP in Lusoga:



     b. N + GenN  e-bí-sàgho + bi-a=jeenga → e-bí-sàghò by-àà=jééngà 'Jenga's bags'
        L L %L H L H L

In (35a) the distant past affirmative verb is followed by an object noun which lacks the augment vowel since it is in focus, while (35b) consists of a genitive construction marked by the proclitic /bi-a=/ on the second noun. In neither case is there HTP as was observed in Luganda in (12a) and (15b), respectively.

While there is no case of a TG consisting of two phonological words (PWs), HTP does apply word-internally and between a PW and certain enclitics. The first is seen in a process of noun reduplication which introduces a derogatory meaning. Thus, when *ò-mú-pákàsí* 'porter' is reduplicated to *ò-mú-pákásí-pákàsì* 'a lousy ol' porter', the medial *-kásí-* (for expected *-kàsì-*) shows HTP. A full derivation is provided in (36).


As seen, we begin with two identical stems /-pakàsi/, which both undergo LTS in (36a). HTI also applies twice in (36b). This is followed by HTP in (36c) and assignment of the boundary tones in (36d).<sup>10</sup>

More significantly for our purposes, (37) shows that HTP also applies between a possessive enclitic and the host noun:

<sup>10</sup>Although not exemplified in §2, HTP also applies within a word in Luganda.



The tones of the unpossessed nouns in the first data column, all of which have a H to L pitch drop, are shown after HTI and LTS have applied, but without a final phrasal H%. As seen, the L tone possessive enclitic /-è/ 'his/her' fuses with a noun class agreement prefix. When HTI applies to the preceding noun, HTP applies, and the H to L pitch drop is lost. (There is no final H%, since the forms end H-L.) As can be recalled from (15a), noun+possessive is an environment where HTP applies in Luganda as well. The examples in (38a,b) show that HTP also applies in verb+enclitic constructions:


In (38a), the locative noun class 17 enclitic *=kù* is also used as an attenuative marker. As seen, HTI applies followed by HTP on the host verb. The same is seen in (38b) with the interrogative enclitic *=cì* 'what'. However, for HTP to apply, the verb must have the same [−focus] status as was discussed in Luganda. Recall that negative verbs are [+focus], and hence although HTI applies before *=kù*, there is no HTP in (38c). In addition, there is no HTP with the corresponding nominal interrogative *=cì* 'which' (also paralleling Luganda; cf. *mù-kázì =cí* 'which woman?'):



As seen, the enclitic *=cì* 'which' does not condition HTP (perhaps because it isn't a YP), but always inserts a H, potentially combining with a preceding L to create a downstepped ꜜH.<sup>11</sup>

The above shows that clitics work differently from full words in Lusoga. HTP occurs in the same environment as in Luganda, except that Z must be an enclitic. Thus, compare (40) with the corresponding Luganda configuration in (11).

$$\begin{array}{ccccc} \text{(40)} & \text{XP} & \text{where:} & \text{(i)} & \text{X} \neq \text{[+focus]} \\ & \diagup\ \diagdown & & \text{(ii)} & \text{Z} \neq \text{[+augment]} \\ & \text{X} \qquad \text{YP} & & & \\ & \phantom{\text{X} \qquad} \mid & & & \\ & \phantom{\text{X} \qquad} \text{Z} & & & \text{Z = an enclitic} \end{array}$$

We have seen that there are two kinds of X=cl: those which form a TG satisfying (40), hence HTP, vs. those which don't satisfy (40), hence occurring without HTP. I propose that the first has the structure of a nested phonological word [[ word ]PW =cl]PW, while the second has the structure of a clitic group [[ word ]PW =cl]CG. If correct, this would mean that HTP only applies within a PW whose definition, however, is subject to the syntactic characterization in (40). A historical conjecture would be that HTP started out in individual words (X), then expanded to X = Z, then X # Z, always meeting the configuration and conditions (i) and (ii) in (40). Note in this regard that enclitics only condition HTP with their lexical host, not with each other:

(41) a-ta-a=muu=kuu=cii buli lunaku → á-tá-á=ꜜmúú=ꜜkúú=ꜜcí bùlì lúnàkú
     s/he-puts=in=a.little=what every day
     'what does s/he put a little of in every day?'

<sup>11</sup>Recall from (34b) that the inserted H cannot be assigned to a single L when it occurs between two phonological words.


In Lusoga, all enclitics are /L/, requiring HTI on the preceding mora. They also differ from full words in preventing a preceding long vowel from undergoing final vowel shortening (cf. 'tree' and 'which tree?' in (39)). The unavoidable conclusion is that Lusoga tonology is not sensitive to prosodic domains above the (nested) PW level.

## **4 Two outstanding problems**

I would like to end the coverage of tonal phenomena by considering two outstanding problems. The first is a return to numerals, this time in Lusoga. We saw in (10b) that Luganda doesn't allow HTA from a numeral onto the preceding noun. There is an analogous issue in Lusoga, which is that numerals which begin with /L/ do not condition HTI (vs. demonstratives, which do). This is seen in (42).


We see this between a numeral and noun in (42a) and between a numeral and a preceding verb in (42b). We know that /bì-bìri/ has a /L/ on its prefix because of the augmented form, *é-bì-bìrí* '(the) two', where the normally L augment receives a H from HTI. Positing an initial %L was said to be unmotivated for Luganda, but is even more so in Lusoga, which otherwise doesn't have clause-internal %L. This is, however, the only situation I have discovered to date where a /L/ does not trigger HTI.

The second issue also characterizes both languages, this time in exactly the same way. The question is why HTA always has to leave at least one L tone behind. This is seen in the Luganda sentences in (43a,b).

(43) a. verb + object  a-láb-à bi-tabo → à-láb-à bì-tábó 's/he sees books'
                         H L            %L H L        H%



As seen, the H% in (43a) is anticipated onto the preceding mora, and yet the prefix *bì-* stays L. In (43b), the H of /bi-kópo/ 'cups' is anticipated up to the second syllable of toneless /mu-limi/ 'farmer', leaving the prefix L. In addition, HTA does not apply from the host onto proclitics, as seen in (43c). The question is: what is wrong with the prohibited L to H sequences in the following corresponding outputs?

(44) a. \* tè-y-à-láb-à ] [ bí-kópò 's/he didn't see cups'
           H L              H L
     b. \* te-y-à-láb-à ] [ bí-tábó 's/he didn't see books'
           H L              H
     c. \* by-àà= [ bá= [ kátáámbâ 'those of the Katambas'
          %L               H L

In (44a) we see that HTA has applied word-internally. As we have said, HTA can only apply if it can cross a word boundary onto a ∅ mora. The problem in (44b) is that HTA should have left one L behind ((43b) shows the same with a lexical /H/). Finally, (44c) shows that a proclitic doesn't count as "crossing a word boundary". Why, in all of the above examples, is HTA prohibited from hitting every available toneless mora on its leftward path?

The answer is that the ungrammatical forms in (44) have the prohibited configuration in (45):

(45) * μ   <sub>PW</sub>[ μ     (NoJump)
       |                  |
       L                  H

The prohibited sequence is one where one would jump from a L to a H across a PW boundary. This NoJump constraint has the following "conspiratorial" effects on HTA: (i) it stops the H from reaching the first mora of a word, which could then be preceded by a (%)L; (ii) it stops the H from reaching the first mora of a proclitic, which would have to be PW-initial, preceded by a (%)L. NoJump is the kind of OT constraint that can of course be dominated by another constraint, e.g.


faithfulness to an input /H/, as in Luganda *tè-y-à-láb-à bí-bàlá* 's/he didn't see fruits', where *bí-bàlá* 'fruits' exceptionally has a /H/ prefix. The constraint in (45) can stop the creation of a L <sub>PW</sub>[ H output, but cannot remove a word-initial H tone. Of course the remaining question is why Luganda and Lusoga bother to implement HTA at all, since the affected moras would otherwise have become L, presumably by default. For this Selkirk (2016) has proposed the constraint HTS-left: H tone wants to spread to the left as far as it can go. The constraint in (45) puts a check on HTS-left: it spreads as far as it can, but stops short if the result would be a L <sub>PW</sub>[ H sequence.
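As a rough illustration of how HTS-left and NoJump interact, the following sketch (my own illustrative Python; the representation and names are invented) spreads a following H leftward through the toneless moras of the preceding word, but refuses to land on that word's first mora, so no L PW[ H configuration can be created.

```python
# Hedged sketch: HTA checked by NoJump (45). Each word is a list of
# mora tones ('H', 'L', or None for toneless); names are illustrative.

def h_tone_anticipate(prev_word):
    """Spread a following H leftward onto the toneless moras of the
    preceding word (HTS-left), but stop short of its first mora:
    landing there could leave the H preceded by a (%)L across the
    PW boundary, the NoJump configuration in (45)."""
    out = list(prev_word)
    if not out or out[-1] is not None:  # HTA needs a final toneless mora
        return out
    i = len(out) - 1
    while i >= 1 and out[i] is None:  # index 0 is never touched
        out[i] = 'H'
        i -= 1
    return out

# Toneless /mu-limi/ 'farmer' before H-initial 'cups', cf. (43b): the H
# reaches the second mora, but the prefix mora is left behind (mù-límí).
print(h_tone_anticipate([None, None, None]))  # -> [None, 'H', 'H']
```

A word ending in L, as in (31b), is returned unchanged, mirroring the condition that HTA must cross the word boundary onto a ∅ mora.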

## **5 Conclusion**

To summarize the findings for Lusoga, there is no empirical evidence for a prosodic domain corresponding to the TP in Luganda. Specifically, there is no evidence that what precedes the verb is treated differently from what follows it. The domain corresponding to the TG in Luganda does exist but is more restricted, being limited to certain word=enclitic combinations.<sup>12</sup> At this point one might ask what other evidence there might be for prosodic domains in Lusoga. Two possibilities are intonation, which has thus far not yielded anything concrete, and instrumental phonetic studies, e.g. on segment durations, which I have not done – and which in any case would take us beyond my question, which had to do with whether there are discrete, categorical effects of prosodic domains in Lusoga.

I would like to conclude with some further thoughts about Lusoga in terms of linguistic typology, defined for our purposes as the study of how languages are the same vs. different. First, since there is no known empirical evidence to choose between the two hypotheses in (26), Lusoga is not a counterexample to the claim that syntax–phonology "matching" is universal. Second, nothing looks syntactically or prosodically aberrant in Lusoga. Rather, it is the lack of interest that Lusoga shows in prosodic constituents that is striking, particularly from a Bantu point of view. In fact, Lusoga provides the missing "cell" in the typology of whether LDs and RDs phrase with the main clause in Bantu:

<sup>12</sup>As pointed out to me by Jenneke van der Wal (p.c.), it is possible to treat such word=enclitic combinations as recursive phonological words, i.e. [[ word ]PW clitic]PW, since they share the same tonal properties as the lexical phonological word.


We have already seen that Luganda and Haya are mirror images of each other as far as whether LDs (Luganda) or RDs (Haya) are marked off from the main clause. Chichewa has been reported to mark off both LDs and RDs (Downing & Mtenje 2011: 1966–1967). Finally, Lusoga provides the fourth possibility: Neither LDs nor RDs are marked off.

The Lusoga disinterest in marking prosodic domains is remarkable from a Bantuist and perhaps universalist point of view. However, it has long been known that languages vary in how much they "care" about some of the "best bets" in phonology. Lusoga can now be added to the list of languages which have shown a disregard for one or another prosodic property:

	- b. Word stress: Bella Coola cares very little if at all about highlighting one syllable per word (Newman 1947: 132)
	- c. Prosodic domains: Lusoga cares very little if at all about reflecting syntactic constituency in the post-lexical phonology (this study)

For me, typology should not only determine the different ways in which universal linguistic properties can be reflected in the grammar of a language, but also how well a grammar can get along without signaling them at all.

## **Abbreviations**


# **Acknowledgements**

This article is a revision of a paper presented at the workshop on the effects of constituency on sentence phonology, University of Massachusetts at Amherst,


on July 30, 2016. I would like to thank the participants for their questions and comments. I am especially indebted to two anonymous reviewers for their extensive comments.

## **References**




# **Chapter 13**

# **Apparent violations of the final-over-final constraint: The case of Gbe languages**

Enoch O. Aboh

University of Amsterdam

In a series of recent talks and articles, Theresa Biberauer, Anders Holmberg, Ian Roberts, and Michelle Sheehan argue that the final-over-final condition (FOFC) is an absolute universal regulating structure building. Yet, many languages deviate from FOFC, thus suggesting that this condition is not "surface-true". The question therefore arises as to what factors make languages violate FOFC on the surface. In order to answer this question, we need a typology of FOFC-violating languages, as well as a detailed description of such violations. In this short essay, I describe FOFC violations in Gbe and some creoles, while relating the observed phenomena to some theoretical questions they raise.

# **1 Introduction**

In a series of recent talks and articles, Theresa Biberauer, Anders Holmberg, Ian Roberts, and Michelle Sheehan analyse a very strong tendency across human languages which appears to be indicative of an absolute universal regulating structure building: the final-over-final condition/constraint (FOFC), defined as in (1) and further discussed in Sheehan et al. (2017), henceforth SBRH.

(1) a. A head-final phrase αP cannot immediately dominate a head-initial phrase βP if α and β are members of the same extended projection.

Enoch O. Aboh. 2020. Apparent violations of the final-over-final constraint:The case of Gbe languages. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 277–292. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972852

### Enoch O. Aboh

b. \*[αP [βP β γ] α], where β and γ are sisters and α and β are members of the same extended projection.

FOFC is not bidirectional since the reverse does not hold: "a head-initial phrase αP may dominate a phrase βP which is either head-initial or head-final, where α and β are heads in the same extended projection" (cf. Biberauer et al. 2014: 171).

Accordingly, FOFC makes strict predictions both in terms of surface typological variation and possible outcomes of language change (cf. Biberauer et al. 2009). For instance, FOFC predicts the structures in (2a–c) to exist with the exclusion of the pattern in (2d) (cf. Biberauer et al. 2014: 171).

In its strong version, the generalisation in (2) could suggest that the human mind "prefers" harmonic structures (2a,b), tolerates one type of disharmonic structure in (2c), and totally excludes the disharmonic structure in (2d). This view is obviously misleading since, looking at surface form only, disharmonic structures abound in languages. This is, for instance, the case in Kwa (see the discussion below), and in Sinitic (cf. Hsieh & Sybesma 2007, Sybesma & Li 2007, Chan 2013 and references therein). On the basis of his database, Dryer (1992) concludes that completely harmonic languages actually represent a minority. Instead, the common cross-linguistic pattern seems to be that languages are rigidly consistent in some domains, but less so in other domains. FOFC therefore seems to strictly constrain certain core structures only. Given its surface flexibility, one

### 13 Apparent violations of the final-over-final constraint

could consider the FOFC effect to derive from processing constraints facilitating parsing. If one were to adopt Hawkins's (1983) *cross-category harmony*, defined in terms of head-dependent order preferences, or his (1990) *early immediate constituents* principle suggesting fast recognition of the immediate constituents of a mother node, it seems intuitive that the parser would prefer orders in which heads and dependents can be easily identified. In this regard, learning biases seem to favour certain orders over others. Under this view, FOFC would be essentially a third factor phenomenon, required by "principles of efficient computation" in terms of Chomsky (2005) (cf. Walkden 2009 for discussion).

SBRH (2017) argue for a different view: FOFC is a property of structure building. At this point, the question arises how the notion of "harmony" relates to structure building and computation. If Merge applies to (categorial) features only, and embeds no spell-out specification, how can we decide that (2d) is computationally disharmonic compared to (2a)? If, on the other hand, one assumes Grimshaw's (1991) extended projection and some version of Kayne's (1994) *linear correspondence axiom* (LCA), as SBRH (2017) do, then disharmonic structures can be understood as involving featural mismatches within a functional sequence. Under this latter view, the bulk of apparent counterexamples to FOFC would derive from movement: structures obey FOFC underlyingly, even though movement operations may lead to apparent surface violations.

It seems to me that two fundamental questions arise here that merit further investigation. The first concerns the relation between the LCA and FOFC, and why the language faculty (in the narrow sense, cf. Hauser et al. 2002) would involve such apparently competing linearization mechanisms. The issue is not trivial, as it relates to the question of the place of linearization within the human faculty of language (cf. Chomsky et al. 2019 and Kayne 2018 for discussion). I will not address this question any further in this essay. The second question, which I will be concerned with instead, is of a typological nature. Why do some languages seem to violate FOFC massively in their surface forms? If Dryer (1992) is right, such violations would be the norm, while FOFC-compliant languages would be the exception. Why would this be if FOFC holds of structure building? Why would languages systematically diverge from core principles imposed by the computational system? For comparison, there does not seem to be any such massive violation of the extended projection principle, a potential universal of natural languages constraining structure building. In order to understand apparent FOFC violations, therefore, we need to take a closer look at the empirical facts.

### Enoch O. Aboh

As I will show in the following paragraphs, the Gbe languages (and, for that matter, many Niger-Congo languages) involve apparent violations of FOFC. I have discussed many of these patterns in previous work and proposed an analysis in terms of the LCA. Since FOFC's formulation in the early 2000s, its proponents have also reported similar patterns cross-linguistically and have suggested various analyses to account for them (see SBRH 2017 and references therein). For instance, final negative markers, such as the one instantiated in the Fongbe example in (3a), can be analysed as not being merged within the functional sequence of TP (cf. Biberauer et al. 2014). That such a view is indeed adequate can be shown by the fact that the Fongbe yes-no question in (3b) displays a similar sentence-final particle, which Aboh (2010a,b) shows interacts with final negation in Gbe, as indicated by example (3c). In this example, the negative particle precedes a focus marker, which in turn precedes the question particle.

(3) Fongbe


Facts like these led Aboh (2010a) to propose that the sentence-final negative particle belongs to the C-domain in Gbe. These data from the Gbe languages already show that FOFC as formulated in (1) is certainly not "surface-true". Can we, however, claim that FOFC constrains the underlying structure? Given that SBRH (2017) adopt Grimshaw's (1991) notion of extended projection, we can answer this question only if we are able to characterize precisely the featural bundles of the different heads within the functional sequence of the left periphery in the Gbe languages. Though there is now a significant body of literature on the complementizer system of the Gbe (and other Kwa) languages, it is reasonable to say that we still do not have a fine-grained map of the featural specifications of C-type heads in these languages, and we do not know how learners acquire these features.

This last question becomes even more critical when considering acquisition in contact situations. Indeed, if FOFC is an inviolable condition, as suggested by SBRH (2017), one could imagine that the primary linguistic data (PLD) that learners are exposed to would not generally contain systematic cues for them to derive FOFC-violating grammars. Put differently, learners must have a way of deducing underlying FOFC-compliant structures from massively FOFC-violating surface forms. One would therefore expect superficially FOFC-violating orders (e.g., VO-Aux, VO-question particle, VO-Neg) to be unstable and eventually lost in contact situations. This expectation, however, is not met in the case of certain creole languages. Indeed, creole languages which emerged in colonial settings involving enslaved Niger-Congo learners (i.e., speakers of Kwa and Kikongo) inherited typical Niger-Congo disharmonic structural properties and therefore display comparable FOFC surface violations.

Since the original formulations of FOFC, I have discussed some of these surface FOFC violations with Ian Roberts and Theresa Biberauer. I was therefore only partially surprised when, on June 3, 2016 at 3:45pm, I received an e-mail from Ian which read as follows:<sup>1</sup>

I'm looking at languages with N-A-Num-Dem U20 order in the DP to see what (if any) clausal word orders they correlate with. Am I right in thinking that Gungbe has head-initial order in the clause? According to WALS, it has head-final question particles though. Is that correct? In that case it looks like an apparent FOFC-violator.

As suggested in Ian's message, the discussion of the sentences under example (3) already indicated that the Gbe languages involve clause-final particles that encode negation (3a), interrogation (3b), or a combination thereof (3c). These languages furthermore display noun-adjective-numeral-demonstrative order, as illustrated in (4). Note also that, within the DP, the determiner and the plural marker occur at the right edge (see Aboh 2004a,b and references therein for discussion):

<sup>1</sup> I am always excited by e-mails from Ian, who also happens to be one of my favourite teachers and is now a very good colleague and friend. Ian introduced me to diachronic syntax at a time when I had no idea such a thing existed. Indeed, he has in various ways inspired my recent work on language contact and change. In addition, as his student, I liked his French accent at a time when, as a *Béninois* trying to make sense of *Français Genevois*, I wondered what French and African politicians meant by "la francophonie". What was the point if I had a hard time understanding both *Genevois* and my French L2-speaker teacher of diachronic syntax? How can we account for such variation in a principled manner? These questions eventually led me to my current work on *hybrid grammars*, a concept that is actually not very far from work that Ian did in collaboration with Robin Clark in the early 90s. But let us return to our current topic of discussion.


(4) Gungbe
    [Òxwé kpɛ̀ví àwè éhè lɔ́ lɛ́] jró mì.
    house small two dem det pl please 1sg-acc
    'I like these two houses.', lit. 'These two houses please me.'

With regard to Ian's message, therefore, these examples indicate that the Gbe languages may constitute counter-examples to FOFC. Sheehan (2013) claims that the number of such FOFC-violating languages is rather restricted. Since the Gbe languages exhibit right-edge (or final) functional elements both in the nominal and in the clausal domain, it is important to look at the facts closely in order to determine whether these languages represent genuine FOFC violations or not. Given the importance of FOFC in the literature, we need to better understand such cases of apparent violation in order to find out whether the principle holds of structure building or whether it relates to surface phenomena deriving from processing (cf. Hawkins 1983; 1990; Walkden 2009). As a first step, the following sections present more data from Gbe and some creoles which appear to violate FOFC.

Recall from the formulation of FOFC in (1) that it excludes structure (2d): no language should exist in which a consistent head-initial structure is dominated by a head-final structure. Under FOFC, therefore, a structure like the one in (3b) cannot have the underlying representation in (5a), but must be analysed as in (5b), in which the complement of the interrogative functional projection InterP raises to its specifier position. In these representations, the sentence-final floating low tone expresses a question particle that takes the clause as its complement. It is worth noting, however, that Aboh (2004a) and Aboh & Pfau (2011) propose the same analysis under the LCA, hence the necessity to tease FOFC-related and LCA-related effects apart.
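The two analyses just described can be sketched in labelled-bracket form as follows; the chapter's own representations (5a,b) are tree diagrams not reproduced here, so these bracketings are my reconstruction from the prose:

```latex
\begin{align*}
\text{(5a)} &\quad *[_{\mathrm{InterP}}\ [_{\mathrm{TP}}\ \ldots\ ]\ \mathrm{Inter}^{0}\,] && \text{head-final Inter over head-initial TP: excluded by FOFC}\\
\text{(5b)} &\quad [_{\mathrm{InterP}}\ [_{\mathrm{TP}}\ \ldots\ ]_{i}\ [\ \mathrm{Inter}^{0}\ t_{i}\ ]\,] && \text{head-initial Inter; TP raises to Spec,InterP}
\end{align*}
```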


It appears from the examples in (3) and (4) that the Gbe languages, like many Niger-Congo languages, display disharmonic structures, as represented in (2c) and (2d), in various components of their grammar (e.g., TP, CP, PP). Likewise, studies on creole languages have shown that some creoles, which emerged from the contact between Gbe languages and French (e.g., Haitian Creole), or between Gbe languages and English (e.g., Sranan, Saramaccan), exhibit similar disharmonic structures in areas of their grammar. Together these facts suggest that such apparent violations of FOFC are not isolated phenomena, and they therefore require some explanation. Such an explanation can only be based on a precise description of the facts. In what follows, I take this first step and illustrate the main contexts in which Gungbe apparently violates FOFC, providing comparable examples from Haitian Creole and the Suriname creoles (e.g., Sranan and Saramaccan). These creoles emerged on the 17th-century colonial plantations of Suriname and Haiti, to which thousands of enslaved African speakers of Niger-Congo languages were deported, and where they came into contact with the languages of their European colonists, namely French in Haiti and English and Dutch in Suriname.

## **2 Initial-over-final in Gbe**

Aboh (2010c) reports that Gungbe involves two types of adpositions, labelled P1 and P2. Elements of type P1 generally derive from posture or locative verbs, while items of type P2 derive from nouns expressing landmarks or body parts. P1 projects a head-initial structure, as indicated in (6a). P2, on the other hand, projects an apparent head-final structure, as in (6b). When P1 and P2 co-occur, P1 must precede the phrase headed by P2, as indicated by example (6c) and represented in (6d).

	- a. Súrù zé kwɛ́ [xlán mì].
	     Suru take money P1 1sg
	     'Suru sent me some money.'
	- b. Súrù xɛ́ [só lɔ́ jí].
	     Suru climb hill det P2
	     'Suru climbed on top of the hill.'
	- c. Súrù nyìn àgán [xlán [só lɔ́ jí]].
	     Suru throw stone P1 hill det P2
	     'Suru threw a stone on top of the hill.'


	- d. [P1P xlán [P2P [DP só lɔ́ ] jí ]]

Note that in this example, both the DP inside P2P and P2P itself display a head-final structure embedded under the head-initial P1P. Biberauer (2016) discusses these examples and concludes that the determining factors allowing these apparent FOFC violations could be the lower structural position of P2 compared to P1, as represented in (6d). Furthermore, P1 and P2 are categorially distinct: the former developed from verbs, while the latter developed from landmark nouns (cf. Aboh 2010c). While this view is plausible, one would need to find out how it squares with Aboh's (2010c) further suggestion that elements of type P2 should be analysed as heading a predicate within a possessive phrase (which, according to him, is typical of such locative expressions). The idea is that a sequence like *só lɔ́ jí* in (6b) should be analogised to *the mountain top* in English, in which *jí*, expressing P2, heads a possessive predicate. If this view is correct, and if we maintain the notion of extended projection as argued for in SBRH (2017), then both P1 and P2 belong to the same extended projection, and we would have to demonstrate how they are categorially distinct.

## **3 Final-over-initial in Gbe**

The discussion above of the yes-no question particle already showed that Gbe languages involve instances of final-over-initial disharmonic orders within the clausal left periphery (cf. Aboh 2016b for further discussion). In what follows, I show that similar disharmonic orders are found within the TP too. In Fongbe, for instance, the so-called completive aspect can be expressed by complex structures in which two apparent verbs flank an object (cf. da Cruz 1995; Aboh 2009; van den Berg & Aboh 2013).

(7) Fongbe
	- a. Kɔ̀kú wà àzɔ̌ ɔ́ fó
	     Koku do work det finish
	     'Koku finished doing the work.'
	- b. Kɔ̀kú ɖù mɔ̀lìnkún ɔ́ vɔ̀
	     Koku eat rice det finish
	     'Koku finished eating the rice.'


Under the assumption that the final verb is comparable to an auxiliary or aspect marker of some sort, these sequences would be akin to the [VO]-Aux order which is banned in Germanic (cf. Biberauer et al. 2014: 173). Da Cruz (1995) analysed these constructions as instances of serial verb constructions, arguing that the final V is a lexical verb with the same thematic properties as in the examples in (8), in which these verbs select an internal argument.

(8) Fongbe (da Cruz 1995: 363)


In recent work, however, van den Berg & Aboh (2013) argue that these constructions should be analysed similarly to equivalent constructions in Gungbe, which do not involve two apparent verbs and in which the final position is realised by the quantifier *kpó* 'all'.

(9) Gungbe


In terms of this proposal, the Gbe languages involve a TP-internal functional projection that expresses event quantification and may be spelled out by a verb root or a quantifier root that merges in its head. Under this view, therefore, the Fongbe and Gungbe sentences in (7a) and (9a), respectively, can be described as in (10), in which the event quantifier merges under F and takes a head-initial VP.


If representation (10) corresponded to the underlying structure, then this and similar examples would be genuine violations of FOFC. Alternatively, however, one can argue, along the lines of van den Berg & Aboh (2013), that the functional element heading event quantification is head-initial, but that its complement must move leftward, presumably to its specifier position, as in (11). In terms of Aboh (2004a,b; 2010a), this event quantifier head belongs to the class of markers in Gbe whose complements must raise to their specifier position.
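The two options just described can be sketched as follows; the chapter's (10) and (11) are tree diagrams not reproduced here, so the bracketings below are my reconstruction from the prose, with F standing for the event-quantification head:

```latex
\begin{align*}
\text{(10)} &\quad [_{\mathrm{FP}}\ [_{\mathrm{VP}}\ \mathrm{V}\ \mathrm{O}\,]\ \mathrm{F}\,] && \text{head-final F over head-initial VP: a genuine FOFC violation if underlying}\\
\text{(11)} &\quad [_{\mathrm{FP}}\ [_{\mathrm{VP}}\ \mathrm{V}\ \mathrm{O}\,]_{i}\ [\ \mathrm{F}\ t_{i}\ ]\,] && \text{head-initial F; VP raises to Spec,FP}
\end{align*}
```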

Under this view, and assuming that Gbe languages are underlyingly head-initial, no issue arises, but this conclusion is not immediately obvious if we assume FOFC and if linearization is not part of core syntax.

## **4 FOFC in language contact and change**

The examples discussed thus far indicate that Gbe languages involve the disharmonic orders in (2c) and (2d). These languages therefore seem to violate FOFC on the surface. As suggested in previous paragraphs, one could hypothesise that such apparent violations of FOFC are unstable in contact situations because FOFC constrains structure building. Alternatively, one could imagine that the pattern, being so robust in Gbe (and other Kwa languages), prevails in contact situations involving Gbe or similar Niger-Congo languages and European languages such as French or English. It is the latter scenario that characterizes certain Atlantic creoles. These new languages display disharmonic orders in areas of their grammar in a way comparable to Gbe. This is the case in Haitian Creole, spoken in Haiti, and in Sranan and Saramaccan, spoken in Suriname. These languages developed in the Caribbean in the late 17th and early 18th centuries during European colonial expansion (cf. Aboh 2015 and references cited there). We now face the crucial question of why, during acquisition in such multilingual contexts, disharmonic structures win over harmonic ones even though the computational system favours the latter.


### **4.1 Initial-over-final within PP: Sranan**

Just as the Gbe languages exhibit P1 and P2 categories with apparently different headedness properties, one finds equivalent adpositions in Early Sranan (12), as well as in other Suriname creoles (cf. Bruyn 2003 and references cited there).

(12) Sranan (Bruyn 2003: 32)

Sinsi a komm *na* hosso *inni* …
since 3sg come P1 house P2
'Since she entered the house …'

The surface string in (12) indicates that, as in Gbe, the Sranan P1 is head-initial and takes a complement which is head-final. Aboh (2010c; 2015; 2016a; 2017) discusses these patterns, as well as other varying word orders found within the PP in these creoles, and shows how they derive from a recombination of syntactic features selected from the Gbe languages and from English.

### **4.2 Final-over-initial within the DP: Haitian Creole**

A similar recombination is found within the DP in Haitian Creole (Aboh & DeGraff 2014; Aboh 2015). This language exhibits both prenominal and postnominal adjectives. The definite/specificity marker must follow the noun phrase, while the indefinite marker *yon* must precede it:

(13) Haitian Creole
	- a. Nana vann gwo wòb la
	     Nana sell big dress det
	     'Nana sold the big dress.'
	- b. Nana vann wòb jòn la
	     Nana sell dress yellow det
	     'Nana sold the yellow dress.'
	- c. Mwen te wè yon moun
	     1sg ant see det person
	     'I saw someone.'

Clearly, the distribution of adjectives in Haitian Creole is similar to that of French adjectives. Under Cinque (2010), French and other Romance languages which exhibit similar distributional properties involve head-initial structures, and the relative position of adjectives (i.e., pre- vs. post-nominal) is derived by N(P)-movement. Taking this as our starting point, it must be the case that the post-nominal determiner-like element in Haitian Creole dominates a head-initial structure. This view is further corroborated by the fact that, unlike adjectives, possessive pronouns and demonstratives, as well as the number marker, follow the Gbe head-final order, as illustrated by example (14).

(14) a. Haitian (Lefebvre 1998: 78)
	     krab mwen sa a yo
	     crab 1.poss dem det pl
	     'these crabs of mine'
	  b. Gungbe (Aboh 2004a,b)
	     àgásá cè éhè lɔ́ lɛ́
	     crab 1.poss dem det pl
	     'these crabs of mine'

Yet example (13c) clearly shows that the indefinite determiner must precede the noun, suggesting a head-initial pattern similar to French *une personne* 'a person'. Again, what we see here is a recombination of the Gbe disharmonic order with the French harmonic order, with mixed headedness properties, leading to apparent FOFC violations.

### **4.3 Final-over-initial within TP: Sranan**

In the preceding paragraphs, I showed that Gungbe, and the Gbe languages in general, involve event quantifiers which, on the surface, seem to exhibit a head-final structure, even though they select a head-initial VP complement. Similar constructions are found in the Suriname creoles as well. An example from Early Sranan is given in (15), in which the so-called completive marker *keba* follows the verb.

(15) Sranan

yu syi tok, nownowdei mi leri *keba* taki a 'oe' musu de ini wan lo geval wan 'u'.
3sg see yet now.red-day 1sg learn already that the 'oe' must be every one lo case a 'u'

'You see, right, nowadays I have learned (I know) that the 'oe' must be (written) as 'u' in any case.'

These constructions are discussed in van den Berg & Aboh (2013), who propose an LCA account along the lines of representation (11) above. In terms of this analysis, *keba* (sometimes also realised as *kba* or *kaba*) is equivalent to the Gbe event quantifiers, in that it heads a functional projection within TP that takes the VP preceding it as its complement. The latter must raise to [Spec, FP] to be licensed, as described in (11).

The preceding paragraphs show that the Gbe languages and some creoles involve a significant body of syntactic patterns which systematically violate FOFC on the surface. These patterns are found within determiner phrases, adpositional phrases, and tense or aspect phrases, as well as within the complementizer system. With regard to aspect phrases, for instance, the discussion of event quantifiers suggests that these languages involve an event quantifier that can project above the VP and surface as a head-final structure even though the embedded VP is head-initial. Assuming that these event quantifiers are aspectual in nature (as commonly accepted in the literature), they are comparable to aspect markers which, in many languages, are expressed by various auxiliaries. Accordingly, we reach the description that these languages appear to exhibit the order [VO]–Aux/Asp, in which a head-initial VP precedes an aspect marker or auxiliary which appears to be head-final. Since it is the absence of the [VO]-Aux order in Germanic which led to the postulation of FOFC (cf. SBRH 2017), one wonders why these languages display, in surface form, a sequence that is banned in Germanic. If the ban in Germanic holds of surface form, why does it not apply to Gbe and similar languages as well? Given such sharp discrepancies between the Gbe languages (Niger-Congo), some creoles, and Germanic, the question arises as to what fundamental aspect of the human language capacity explains FOFC and the observed cross-linguistic variation. Theresa Biberauer's chapter in SBRH (2017) addresses some of these questions, but I hope that the data provided here will enable further research in this domain.

## **Abbreviations**



## **References**






# **Chapter 14**

# **Revisiting the lack of verbal** *wh***-words**

# Aritz Irurtzun

CNRS-IKER (UMR 5478)

I propose that the cross-linguistic lack of verbal *wh*-words derives from the ill-formed logical form (LF) representations they would generate: verbs are predicates of eventualities, and predication (≈ logical attribution) and questioning are incompatible. I revisit the literature on interrogative pro-verbs, arguing that there are no genuine interrogative verbs unrestrictedly ranging over any eventuality type. Lastly, I argue that my proposal also predicts the universal lack of other conceivable interrogative elements such as adpositions or tense markers.

# **1 Impossible questions**

One of the prima facie most puzzling cross-linguistic constraints is the apparent lack of genuine verbal *wh*-words asking about the nature of the eventuality at stake.

For an illustration, let us take a situation like the one in Figure 14.1, the assassination of Julius Cæsar (as depicted in the 1798 painting by Vincenzo Camuccini).

Figure 14.1: *La morte di Cesare* (V. Camuccini)

Aritz Irurtzun. 2020. Revisiting the lack of verbal *wh*-words. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 293–316. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972854


Such an event can be described with the proposition expressed by (1), a classic example discussed by Davidson (1967) and many others:

(1) Brutus stabbed Cæsar.

But besides asserting what happened, there is a variety of questions we may ask about the event: questions about the killer (2), the victim (3), the location of the event (4), the moment at which it took place (5), the way it was performed (6), or the motives of the assassin (7), among others:


All of them are perfectly grammatical questions. However, there is one type of question that we cannot directly ask: we cannot ask about the nature of the eventuality itself. There is simply no interrogative pro-verb that would allow us to ask questions such as (8):

(8) \* *Whxyzed* Brutus Cæsar? 'What type of event happened such that it has Brutus as external argument and Cæsar as internal argument?'

We could generalize this observation as in (9):

(9) *Generalization*: There are no verbal *wh*-words ranging over any eventuality type.<sup>1</sup>

This is such an obvious fact that it has seldom been discussed in linguistics (see a few exceptions in Hagège 2003; 2008; Cysouw 2004; Idiatov & van der Auwera 2004).

The way many languages (including English) have of circumventing the lack of verbal *wh*-words is to decompose the transitive pro-verb of (8) into a dummy *do* verb and an interrogative pronoun as its direct object, as in (10):

(10) *What* did Brutus *do* to Cæsar?

<sup>1</sup> See §4 for discussion.


In this article, I discuss the nature and strength of this constraint and propose a formal account for it based on general legibility constraints (representational well-formedness conditions) at the interface between language and the conceptual–intentional (C–I) systems which would be violated by genuine interrogative pro-verbs. After briefly discussing the cross-linguistic availability of interrogative pro-verbs in §2, in §3 I make the proposal that the lack of verbal *wh*-words is due to the fact that sentences with genuine interrogative pro-verbs would generate ill-formed logical forms for the C–I interface. In §4 I revisit the evidence for interrogative pro-verbs in the light of my proposal and in §5 I briefly address a prediction my proposal makes regarding the unavailability of other conceivable *wh*-words such as interrogative adpositions or tense markers.

## **2 A marked cross-linguistic option**

The lack of verbal *wh*-words is a cross-linguistically pervasive phenomenon, to the point that Hagège (2003) asks "Whatted we to interrogative verbs?" as a way of expressing their typological rarity (see also Hagège 2008; Idiatov & van der Auwera 2004 for further typological analyses).<sup>2</sup>

In what is probably the broadest comparative analysis so far, Hagège (2008) studies a sample of 217 languages, of which he classifies only 28 as displaying interrogative pro-verbs. He conjectures that this may be due to an economy restriction against morphologically unanalyzable forms:

This suggests that if interrogative verbs are found in so few languages, one of the reasons might be that most of them use an uneconomical device, by saying 'do what', for example, in a single unanalyzable unit, instead of using a succession of two very frequent elements, meaning, respectively, 'do' and 'what'. (Hagège 2008: 30)

I believe that this cannot be the reason for their scarcity, for otherwise decomposed *wh*-words (*what person* = who, *what place* = where, *what time* = when, etc.) would be the norm across natural languages, and languages would resemble each other much more in this respect.<sup>3</sup> In the next section I make an alternative, formal proposal to account for this typological puzzle, claiming that genuine interrogative pro-verbs (verbs asking about eventuality types) cannot exist because they would violate legibility constraints at the C–I interface.

<sup>2</sup> Actually, in Basque (isolate), as in other languages, there is the morphological equivalent of Hagège's (2003) "whatted", *zertu*, which is composed of the indefinite/interrogative pronoun *zer* and the verbalizer suffix *-tu*. This verb, however, does not have the value of an interrogative pro-verb, but that of a "regular" pro-verb: it is used to avoid lexically expressing the nature of an eventuality, typically because of word-retrieval difficulties, or because the speaker wants to avoid being too explicit about it (because of taboo or the like).

# **3 A conjecture: Illegibility at the conceptual–intentional interface**

I would like to propose that the lack of verbal *wh*-words cross-linguistically derives from a legibility constraint at the interface between the linguistic computation and the language-external conceptual–intentional systems (by assumption, universal across the species). The idea is that the C–I systems impose legibility well-formedness conditions on their possible inputs (namely, on the form of acceptable logical form representations), and the logical forms corresponding to sentences including genuine interrogative pro-verbs would violate those legibility constraints. Thus, if my hypothesis is correct, the general lack of verbal *wh*-words is an interesting fact about languages, but not a linguistic fact in essence (for it derives from conditions imposed onto language by language-external systems of thought).<sup>4</sup>

In particular, my proposal is that the lack of interrogative verbs derives from a general constraint on the logic of predication: predication amounts to logical assertion, whereby a property is ascribed/attributed/applied to an object (cf. i.a. Engel 1989; Partee et al. 1990; McGinn 2000; Davidson 2005; Burge 2007; Liebesman 2015). That is, predicates predicate, and it is for this reason that predication qua interrogation is incongruent (and not only in first-order logic).

<sup>3</sup> Alternatively, Idiatov & van der Auwera (2004) hypothesize that *wh*-questions involve an existential presupposition such as (i):

(i) A constituent question is a question that asks for an instantiation of the variable in an "It is known that (possibly) happen/exist(… …)" structure.

According to their analysis, such a structure would be the presupposition that the situation under interrogation (possibly) exists, existed or will exist, and the variable *x* is formally expressed by an interrogative pro-word. They conjecture that only "endocentric phrasal" elements can be *wh*-words, but such an analysis is also problematic, since it implies that all *wh*-words are phrasal and that verbs are simple terminal elements, contrary to standard analyses of argument structure (see below).

<sup>4</sup> See Chomsky (2005); Berwick et al. (2011); Roberts (2012); Biberauer & Roberts (2017) for discussion of the different factors affecting the design features of I-languages.


Furthermore, I shall argue that "interrogation qua predication" would also result in logical form representations with DPs devoid of θ-roles (violating the θ-criterion, cf. Chomsky 1981 or Higginbotham 1985).

To begin with, it is essentially a truism that argument DPs function as participants in the eventuality denoted by the verb in a clause. Semanticists and philosophers of language have distinguished different types of participation (the literature talks about agents, themes, undergoers, experiencers, beneficiaries, etc. as the potential thematic (or θ-) roles that a verbal argument can have), and the existence of some sort of θ-roles is virtually undisputed in linguistic theory, even if their conception and ontological status varies from one work to the next (see e.g. Carlson 1984; Dowty 1989; Parsons 1995). A more "syntacticising" view of θ-roles even proposes that θ-roles should be syntactically conceived as formal features, with a legibility requirement that those features be derivationally checked by logical form (LF) (see i.a. Bošković & Takahashi 1998; Hornstein 1999; Lasnik 1999; Manzini & Roussou 2000; Fanselow 2001; Bagchi 2007).<sup>5</sup>

In particular, θ-roles are central to neo-Davidsonian semantics, a conception of semantics deeply rooted in the philosophy of language that constitutes a natural partner of minimalist syntax (see Parsons 1990; 1995; Herburger 2000; Hornstein 2002; Pietroski 2002; 2003; 2005; Schein 2002; Irurtzun 2007; Lohndal 2014). In this framework, θ-roles function as the link between arguments and events in logical form. For instance, example (1) – repeated here as (11a) – would have the neo-Davidsonian logical form representation in (11b), which roughly reads as "there was an event that was a stabbing event that is past and whose agent was Brutus and whose patient was Cæsar":

	(11) a. Brutus stabbed Cæsar.
	     b. ∃e [Agent(e, Brutus) & Stabbing(e) & Past(e) & Patient(e, Cæsar)]

The nature of each θ-role directly derives from the bottom-up syntactic composition of the clause, whereby DPs are merged in specific positions within the projection of event-denoting heads (see i.a. Pietroski 2003; 2005; Borer 2005; Ramchand 2008).
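To illustrate the bottom-up composition just described, a Pietroski-style conjunctivist derivation of (11b) can be sketched as follows; this is a simplified sketch of my own, with tense added at the clausal level and the θ-role labels taken from (11b):

```latex
\begin{align*}
[\![\,\text{stab C\ae sar}\,]\!] &= \lambda e.\ \mathrm{Stabbing}(e) \wedge \mathrm{Patient}(e,\text{C\ae sar})\\
[\![\,\text{Brutus stab C\ae sar}\,]\!] &= \lambda e.\ \mathrm{Agent}(e,\text{Brutus}) \wedge \mathrm{Stabbing}(e) \wedge \mathrm{Patient}(e,\text{C\ae sar})\\
\text{existential closure:} &\quad \exists e\,[\mathrm{Agent}(e,\text{Brutus}) \wedge \mathrm{Stabbing}(e) \wedge \mathrm{Past}(e) \wedge \mathrm{Patient}(e,\text{C\ae sar})]
\end{align*}
```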

<sup>5</sup> In the P&P framework, the projection principle guaranteed that all argument-structure restrictions were set at D-Structure, but with the minimalist abandonment of internal levels of representation, the option opened up for not all argument-structure relations to be set at first merge, thereby allowing for movement into θ-positions (see the references above).

I would like to propose that the requirement for DPs to bear θ-roles derives precisely from the neo-Davidsonian logical form representation of sentences at the C–I interface: as shown in (11b), θ-roles relate individuals and eventualities, and my proposal is that *wh*-words introduce variables that may range over individuals, as in (12a), for 'Who stabbed Cæsar?', or (12b), for 'Whom did Brutus stab?', or over other elements like adjuncts (see §5 below), but *not* over predicates of eventualities. As a matter of fact, predicating an interrogation is logically incongruent, for predicates *assert/attribute* while interrogations *query* (12c):<sup>6</sup>

	- a. ∃e [Agent(e, ?) & Stabbing(e) & Past(e) & Patient(e, Cæsar)]
	- b. ∃e [Agent(e, Brutus) & Stabbing(e) & Past(e) & Patient(e, ?)]
	- c. \* ∃e [Agent(e, Brutus) & ?(e) & Past(e) & Patient(e, Cæsar)]

That is, the logical form in (12c) involves a predicate that questions its own essence, and this is incompatible with the essential function of a predicate: predicating (i.e. ascribing properties).

Furthermore – and this is important (see §4) – a logical form along the lines of (12c) would still be unwarranted. In fact, a predicate like *?(e)* crucially deprives the eventuality of any nature (it is completely undetermined), and as a consequence the DPs participating in the eventuality get no θ-role (given that θ-roles directly depend on the nature/structure of the eventuality at stake). In other words, in the absence of a specific semantic (and structural) specification for the verbal predicate of eventualities, its arguments will also be devoid of any θ-role, since θ-role assignment directly depends on the structure at the *v*P layer.<sup>7</sup>

<sup>6</sup> For simplicity, I stick to this declarative type of logical form representation; see Lohndal & Pietroski (2011) for an approach to an "I-Semantics" for questions.

<sup>7</sup> In particular, decompositional analyses such as Ramchand's (2008) propose that verbal predicates are phrases that may be composed of different heads (Initiationº, Processº, Resultº) ordered in the hierarchical embedding relation of sub-events, and that the θ-role that a DP gets directly depends on the position where it was merged.

14 Revisiting the lack of verbal *wh*-words

Thus, rather than (12c), the consequence of having an "interrogation-cum-predication" would be along the lines in (13), where the blank slots represent the unassigned θ-roles of the participants:

(13) \* ∃e [ (e, Brutus) & ?(e) & Past(e) & (e, Cæsar)]

Note that something like (13) is not a mere instance of structural ambiguity vis-à-vis the hearer, but an instance of *structural vagueness* and therefore of ungrammaticality (cf. the θ-criterion). An underspecified representation such as (13) would generalize over all sorts of argument structures with different θ-role assignments: from *Brutus stabbed Cæsar* to *Brutus liked Cæsar*, *Brutus had Cæsar*, *Brutus obtained Cæsar*, *Brutus created Cæsar*, *Brutus became Cæsar*, or *Brutus was Cæsar*.
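To see the vagueness concretely, compare two of the determinate logical forms that the underspecified (13) would generalize over; the role labels in the second line are my own illustration for the stative predicate:

```latex
% Two determinate logical forms subsumed by the vague (13); labels for the second are illustrative.
\begin{align*}
&\exists e\, [\mathrm{Agent}(e, \text{Brutus}) \wedge \mathrm{Stabbing}(e) \wedge \mathrm{Past}(e) \wedge \mathrm{Patient}(e, \text{Cæsar})] && (\textit{Brutus stabbed Cæsar})\\
&\exists e\, [\mathrm{Experiencer}(e, \text{Brutus}) \wedge \mathrm{Liking}(e) \wedge \mathrm{Past}(e) \wedge \mathrm{Theme}(e, \text{Cæsar})] && (\textit{Brutus liked Cæsar})
\end{align*}
```

Since (13) leaves both the event predicate and the role slots open, it fails to discriminate between such representations, and neither DP receives a determinate θ-role.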

Again, the way English (and many other languages) circumvents the lack of verbal *wh*-words is to employ a complex *do what* predicate that introduces a direct object and implies the assignment of an Agent θ-role to the subject. This, of course, results in a convergent logical form representation. In contrast, the logical form in (13) is critically underdetermined, since *(e, Brutus/Cæsar)* may correspond to any θ-role (agent, experiencer, possessor, …). In fact, there is no neat way of expressing such a meaning in plain English (which is precisely my point), but it would correspond to some higher-order description including metalinguistic terms along the lines already expressed in (8), here modified to (14):

(14) Meaning of (13): 'What type of eventuality happened such that it has Brutus as external argument (whatever the θ-role) and Cæsar as internal argument (whatever the θ-role)?'

The fact that the assignment of θ-roles depends on the structure of the sentence, and that different θ-roles depend on different syntactic configurations, makes clear that questions such as (8) or (13) cannot exist in natural language. In a nutshell, then, my proposal is the following:

(15) *Proposal:* The lack of verbal question-words derives from the illegibility they would generate at the C–I interface, since their semantics involves predicating interrogations and a failure to assign θ-roles to event participants.

In the next section I revisit the cross-linguistic evidence for interrogative pro-verbs, arguing that a large number of the "interrogative verbs" reported in the literature do not question the type of eventuality itself, and that the few cases that actually do so are semantically loaded, so that specific event structures and θ-roles (or macro-roles) are established.

## **4 Revisiting the evidence**

The hypothesis I just presented predicts the lack of *wh*-words that question the nature of an eventuality. However, note that it leaves room for verbal *wh*-words to exist, provided that they are semantically "loaded" (the type of eventuality they stand for is determinate, and so are the θ-roles of their participants). I will argue that this prediction is borne out and that the few predicates questioning the nature of the eventuality that are found cross-linguistically are of this sort: they are not agnostic as to the type of eventuality at stake.

Specifically, I review the evidence for interrogative verbs available cross-linguistically, arguing (i) that many of the alleged interrogative verbs are merely verbal forms employed in questions that do not question the type of eventuality at stake, (ii) that interrogative verbs are often syntagmatic (of the *do what* type) rather than atomic and unanalyzable, and (iii) that those languages that do have genuine interrogative verbs questioning the type of eventuality involve a specific argument structure (hence, they do not contradict the generalization in (9)).

### **4.1 Not questions on the nature of the eventuality**

Besides being scarce, the literature on interrogative verbs is often contradictory, in that different authors discuss phenomena of a very different nature. This is the case of verbs with "interrogative mood", a phenomenon that should be treated as completely separate from interrogative pro-verbs.

As an illustration, Kalaallisut (Eskimo-Aleut) is a language with "interrogative mood" verbs, but it lacks genuine interrogative pro-verbs: Sadock (1984: 199) analyzes a set of verbal forms in Kalaallisut that appear in interrogative constructions, but, as his description makes clear, rather than verbal question words, these are verbs with "interrogative mood", which is used in the formation of both alternative questions and question-word questions.<sup>8,9</sup>

<sup>8</sup>When discussing cross-linguistic examples, I provide the glosses as in the original sources cited. The only exception is Dyirbal (32–33), which does not have glosses in the original (Dixon 1972). The glosses I give for those examples are adapted from Hagège (2008).

<sup>9</sup> See also Hagège (2008) for further discussion of interrogative *naak* 'be where' and further arguments against considering Kalaallisut a language with interrogative pro-verbs.



A similar pattern is attested in Nivkh (isolate; cf. Gruzdeva 1998; Nedjalkov & Otaina 2013). In this language, the suffix *-lo/-l* is attached to the finite verb in order to mark polarity questions:<sup>10</sup>

(19) Nivkh

If p'rə-d̹
s/he come-ind
'S/he came.'

(20) Nivkh

If p'rə-lo/p'rə-l?
s/he come-q/come-q
'Did s/he come?'

Likewise, "interrogative verbs" in Ipai (Yuman; Langdon 1966), Maidu (Maiduan; Shipley 1964), Kwamera (Austronesian; Lindstrom & Lynch 1994) and many other languages, rather than pro-verbs over eventuality types, are just verbal forms restricted to polar question sentences.

So, what we observe in the interrogatives in these languages is not pro-verbs that stand for different types of eventualities but specific verbal forms (specific verbs or verbal particles) employed in interrogatives over participants, adjuncts, or the polarity of the clause, which is a completely different phenomenon.

<sup>10</sup>Examples from Nedjalkov & Otaina (2013: 116 and 137).

### Aritz Irurtzun

Besides, there are also languages like Lavukaleve (Central Solomons). This language is also said to have interrogative verbs, but its interrogative predicates have a very specific semantics: rather than expressing queries over types of eventualities, they question their location. For instance, consider (21) and (22), where the locative is expressed with an adjunct in the former and with a verb in the latter (from Terrill 2003: 457 and 460):

(21) Lavukaleve

le inu ria ngoa me-m inu
but 2sg where stay hab-sg.m 2sg
'But where do you live?'

(22) Lavukaleve
me-kalam vasia-m
2pl-father be.where-sg.m
'Where is your(pl.) father?'

A similar thing happens in Puyuma (Austronesian), a language that has two verbal interrogatives, *kuda* 'how' and *muama* 'why', neither of which questions the nature of the eventuality (Teng 2007). This is in fact a very common pattern, present in languages ranging from Makalero (Trans-New Guinea; see Huber 2011) to Wayuu (Arawakan; see Guerreiro et al. 2010), Atayal (Austronesian; see Huang 1996) and many others. What we see is that very often the purported interrogative verb of a language does not question the nature of the eventuality itself but its location, causes, etc. Thus, they do not contradict the generalization in (9).

### **4.2 Syntagmatic structure**

The nature of "interrogative verbs" in other languages is not very clear. For instance, Hagège (2008: 2) treats Mandarin *gànmà* in (23) as atomic, arguing that this makes it an interrogative verb. However, this is debatable: Luo (2016: 169) argues that at least in Tianjin Mandarin, *gànmà* is straightforwardly analyzable as composed of *gàn* 'do' and *mà* 'what', which, actually can appear freely and as a modifier, as in (23) and (24):

(23) Tianjin Mandarin
ní zāi gàn mà ne?
2sg prog do what q
'What are you doing?'

14 Revisiting the lack of verbal *wh*-words

(24) Tianjin Mandarin
mà dier?
what place
'Where?'

But rather than an idiosyncrasy of Tianjin Mandarin, this is a more general pattern: a similar situation is found in Yongxin Gan (Sino-Tibetan), where *zū* 'do' and *guá* 'what' are merged into *zuá* 'do what' (Luo 2016: 170):

(25) Yongxin Gan

jin tɕʰei kie(taŋ) tsua?
2sg prog here do.what
'What are you doing here?'

Luo (2016: 170, fn. 7) further notes that such a morpho-phonological merger

occurs only in the dialect spoken in the townships Wenzhu, Gaoxi, Longtian, and part of Shashi, not in the dialect spoken in the country town (Hechuan Township) and nearby, where 'do what' is more frequently pronounced as *tsu ga*, and *ga* 'what' is an (analyzable) object of the verb *tsu* 'do'.

And such a pattern is common in Sinitic languages (cf. e.g. Chongqing Mandarin *zuăzi* 'do what' from *zuo* 'do' and *sazi* 'what').

Besides, this is also the case in languages of different families and types, such as Huallaga Quechua, with *imana* 'do what' composed of *ima* 'what' and *na-* 'do' (Weber 1989); Wikchamni (Yokuts), with *hawit* composed of *ha* 'what' and *witi* 'say, do' (Gamble 1978); Mian (Trans-New Guinea), where *fatnà* 'do what' is probably composed of *fàb* 'where, what' and a finite form of the verb *na* 'do' (see Fedden 2011); Chemehuevi (Uto-Aztecan) *hagani*, composed of the interrogative stem *haga* and the suffix *-ni*, "most certainly relatable to *uni-* 'do'" (Press 1979: 89); or the Oceanic language Mavea, where *iseve* 'do what' seems to be composed of *sa* 'what' and *v̈e* 'make' (Guérin 2011: 312, fn. 46).<sup>11</sup>

Also, Udihe (Tungusic) has been analyzed as a language with an interrogative pro-verb, but the evidence from this language is not very clear: Nikolaeva & Tolskaya (2001: 352–353, 802) note that its pro-verb *ja-/i-* may occur with interrogative object pronouns, where it only means 'do'; see (26) and (27):

<sup>11</sup>Besides, other languages such as Baure (Arawak) resort to the nominalization of a dummy verb 'do' that can also be employed in declaratives meaning 'say' (Danielsen 2007).


(26) Udihe
J'e-we ja:-i?
what-acc prov.pst-2sg
'What were you doing?'

(27) Udihe
Si j'e-we ja-zaŋa-i?
you what-acc prov-fut-2sg
'What will you do?'

But it may also appear with a different nominal in the reflexive, accompanied by *ono* 'how' (28), or independently, meaning 'do what' (29):

(28) Udihe

Ono ja:-i mä:usa-i?
how prov.pst-2sg.f gun-refl
'What did you do with your gun?'

(29) Udihe

Ono ñixe-ze-mi bi i:-te-mi-ne?
how do-sbjv-1sg me prov-perm-1sg-cntr
'How shall I do (it), what shall I do?'

Furthermore, it also has a non-interrogative indefinite use, as shown in (30):

(30) Udihe

Emiŋe sita-i muñeli:-ni, e-ini-de olokto-won-o, e-ini-de ja-wan-a.
mother child-1sg sorry-3sg neg-3sg-foc cook-caus-ep neg-3sg-foc prov-caus-ep
'The mother feels sorry for her daughter, she does not force her to cook, she does not force her to do anything.'

All in all, we cannot conclude that these are genuine interrogative verbs.

### **4.3 Restricted syntax and loaded semantics**

Lastly, there are some languages that do seem to have interrogative verbs asking about the event at stake, but I would like to argue that, rather than being agnostic regarding the eventuality type, they presuppose specific argument structures and are therefore quite restricted in their use.


For instance, Cavineña (Tacanan) has an interrogative verb *a(i) ju-*, translated as 'do what', which is restricted to intransitive clauses (Guillaume 2008). The same seems to be the case in Mapudungun (Araucanian) with the interrogative verb *chum-* (de Augusta 1903; Smeets 2007), in Evenki (Tungusic) with *e:-* (Nedjalkov 1997), and in Mongolic: Buryat *yaa-* (Skribnik 2003), Khalkha *yaa-* (Svantesson 2003), Kalmuck *yagh-* (Bläsing 2003), and Bonan *yangge-* (Hugjiltu 2003).<sup>12</sup> This is also the case of Melanesian Tinrin *trò*, which Osumi (1995: 229) describes as asking about "a subject's problematic situation", where "something is wrong with the subject and the speaker is concerned about the matter. The subject cannot be in the first person" (Osumi 1995: 233); of Wangkajunga (Pama-Nyungan) *wanjal-arri* (Jones 2011); and of Erromangan (Austronesian) *owo*, which "normally appears in a structurally minimal clause with no accompanying words" (Crowley 1998: 238), as in (31):<sup>13</sup>

(31) Erromangan
Kem-awo?
2sg:prs-mr:do.what
'What are you doing?'

Other languages have different interrogative pro-verbs for intransitive and transitive predicates. This is the case, for instance, of languages like Dyirbal (Pama-Nyungan), with intransitive *wiyamay* and transitive *wiyamal* (Dixon 1972: 55):<sup>14</sup>

<sup>12</sup>Among the Mongolic languages, Shira Yughur seems to be an exception in having two interrogative verbs: *yima-gi* 'to do what' and *yaa-gi* 'to do how' (Nugteren 2003). Other Mongolic languages such as Dagur, Ordos, Oirat, Moghol, Mongghul, Mangghuer, or Santa are not reported to have interrogative verbs (see the works in Janhunen 2003).

<sup>13</sup>Gumbaynggir (Pama-Nyungan) is analyzed by Eades (1979) as having just one interrogative verb that "is transitive and appears to mean 'do what?' or 'what's the matter?'" (Eades 1979: 302–303), but the example she gives (i) does not have any direct object, and neither the structure nor the interpretation of the construction is clearly transitive (the gloss she provides for the verb, intr.vb-pst, also suggests that it is really an intransitive verb):

(i) Gumbaynggir
ɟira-ŋ ŋiːnda gaːgal-a
intr.vb-pst 2sg.a beach-loc
'What was wrong with you at the beach?' *or* 'What were you doing at the beach?'

<sup>14</sup>*wiyamay* loses its final *-y* before *-ɲ* in (32) and *wiyamal* loses its final *-l* before *-n* in (33). These verbs can also be used adverbially with a different interpretation.


(32) Dyirbal
bayi yaɽa wiyama-ɲu?
cl.nom man.nom do.what-ut.intr
'What was man doing?'

(33) Dyirbal
ŋinda bayi yaɽa wiyama-n?
2sg.erg cl.nom man.nom do.what-ut.tr
'What did you do to man?'

A similar pattern is observed, for instance, in Vitu (Austronesian), with a distinction between *(ku)ziha* for intransitives and *kuzihania/kuzingania* for transitives (van den Berg & Bachet 2006); in Kiribati (Austronesian), with *aera* (intransitive) vs. *iraana* (transitive) (Groves et al. 1985: 82); in Pitta-Pitta (Pama-Nyungan), with *min̪akuri* (intransitive) vs. *min̪akana* (transitive) (Blake 1979); in Motuna (Papuan), where the interrogative verb *jeengo-* takes middle voice in intransitives and active voice in transitives (Onishi 1994); and in Martuthunira (Pama-Nyungan), where interrogative verbs are built upon the base *whartu* 'what' by the addition of either the inchoative *-npa-*∅ or the causative/factitative *ma-L* (Dench 1994). This is in fact quite a common pattern, attested from Chukchee (Chukotko-Kamchatkan; Spencer 1999; Dunn 1999) and Kharia (Austroasiatic; Peterson 2010) to a wide range of Oceanic and Australian languages that employ voice or "valency augmenting" morphemes.

The only language in Hagège's (2008) typology that he classifies as allowing intransitive, transitive, and ditransitive constructions with interrogative verbs is Nêlêmwa (Austronesian), but the data discussed in Bril (2002; 2004) show that the same verbal form cannot participate in any type of argument structure. In fact, the interrogative verb of Nêlêmwa is not a verb that questions the nature of the eventuality itself; it is a manner-questioning verb, similar to the patterns reviewed in §4.1.<sup>15</sup> What is more, Nêlêmwa – as is the case in many Oceanic languages – employs particular suffixes for augmenting the valency of a verb, so that different verbal forms are associated with different argument structures and thematic relations. Thus, the form of the interrogative verb *kaamwa?* 'to do/proceed how', which is apparently employed in intransitive clauses and in transitive clauses with a [−animate] object (34–35), changes into *kaamwi?* in transitive constructions with a [+animate] direct object (36), and into *kaamwale?* in transitive constructions with a [−human] direct object and a specific reading of preparing something or proceeding to do something (37):<sup>16</sup>

<sup>15</sup>Nêlêmwa has at least two other interrogative verbs: *iva?* 'to be where' and *shuva* 'to be how', apparently both restricted to intransitive environments.

<sup>16</sup>All examples taken from Bril (2002: 50).


(34) Nêlêmwa
na kaamwa bwat hleny?
1sg do.how box this.dei
'What do I do with this box?'

(35) Nêlêmwa

na kaamwa me na tami bwat hleny?
1sg do.how depend 1sg open box this.dei
'How do I do to open this box?'

(36) Nêlêmwa

co u kaamwi thaamwa hleny?
2sg acc do.how woman this.dei
'What did you do to this woman?'

(37) Nêlêmwa

hâ kaamwa-le nox-ena?
1pl.incl do.how-tr fish-this.dei
'How do we prepare this fish?'

So, *kaamwa* does not question the nature of the eventuality itself, and furthermore, the verb changes with the argument structure.

This is also something we can observe in Formosan languages like Kavalan (Austronesian; Lin 2012: 186). In this language, the interrogative verb *quni* can get different readings ('do what', 'do how', 'go where') in different environments: in (38) it gets the 'go where' reading in an intransitive construction (where the subject gets the θ-role of a theme), and in (39) it gets the 'do what' reading associated with an agent subject, but, crucially, there the verb is marked with the agent voice (av) marker:

(38) Kavalan
quni=pa=isu?
go.where=fut=2sg.abs
'Where are you going?'

(39) Kavalan
q〈um〉uni=isu tangi?
〈av〉do.what=2sg.abs just.now
'What were you doing just now?'

And a similar thing happens in Amis (Austronesian), where *maan* 'what' can be employed as a verb with voice markers (*ma-, mi-, -en*, etc.) co-varying with the argument structure (Lin 2012: 192):


(40) Amis
Ma-maan cingra?
av-what.happen 2sg.abs
'What happened to him?'

(41) Amis

Mi-maan ci-Panay?
av-do.what ncm-pn
'What is Panay doing?'

(42) Amis

Na maan-en isu ku-ra wacu?
pst do.what-pv 2sg.erg abs-that dog
'What did you do to that dog?'

I shall conclude from this that when verbs question the type of eventuality, they tend to do so within a restricted set of options sharing an essential argument structure.<sup>17</sup> This means that when a given language allows a question such as (43a), its logical form will not be of the type in (43b) – roughly, "What type of eventuality are you participating in, such that you are experiencing it or undergoing it or performing it or initiating it, etc.?" – but the more precise (43c), roughly, "What are you doing?":

(43) a. *Whxyzing* you?

b. \* ∃e [ (e, you) & ?(e) & Present(e)]

c. ∃e [Agent(e, you) & Action(e, ?) & Present(e)]

Likewise, rather than the structurally vague (44b), a question such as (44a) (=8) will have a logical form along the lines in (44c); roughly, "What type of action did Brutus do to Cæsar?":

(44) a. *Whxyzed* Brutus Cæsar?

	- b. \* ∃e [ (e, Brutus) & ?(e) & Past(e) & (e, Cæsar)]

c. ∃e [Agent(e, Brutus) & Action(e, ?) & Past(e) & Theme(e, Cæsar)]

Again, note that this is not a matter of the informativity of the question: there is nothing informationally wrong with a question with higher-order grammatical terms such as "What type of eventuality happened such that it has Brutus as external argument and Cæsar as internal argument?". It is just not natural language.

<sup>17</sup>The fact that in many languages interrogative verbs are morphologically related to indefinite and deictic elements (cf. Hagège 2008) also supports the idea that these verbs imply a large semantic/discursive load.

This state of affairs contrasts sharply with the case of non-interrogative pro-verbs like the aforementioned Basque *zertu* (cf. footnote 2), which are relatively abundant cross-linguistically. Non-interrogative pro-verbs are typically employed when encountering difficulties with word retrieval, i.e. in situations where the speaker construes a determinate argument structure (with a proper θ-role assignment, etc.) but fails to retrieve the PF exponent of the corresponding verb.

## **5 A further prediction: Interrogative adpositions?**

The analysis proposed in §3 is based on the idea that natural language cannot question predicates of eventualities because that would generate ill-formed representations at the C–I interface. This makes a further prediction: the impossibility should extend to other analogous constructions whose semantic contribution is the introduction of a predicate of eventualities. I think this is the case, as shown, for instance, by the apparent cross-linguistic lack of interrogative adpositions.

What is the semantic contribution of an adposition? Davidson (1967) originally proposed that a sentence like (45a) should be characterized as having the logical form in (45b), with *to* introducing a predicate of events that is conjoined to the denotation of the verb:<sup>18</sup>

	- a. I flew my spaceship to the morning star.
	- b. ∃e[flying(I, my spaceship, e) & to(the morning star, e) & Past(e)]

But as argued by Larson & Segal (1995), this seems to imply that the event *e* stands in the 'to' relation to the morning star, which is quite obscure. Likewise, sentence (46a), with a neo-Davidsonian logical form along the lines in (46b), would imply that there exists some kind of "with-a-knife" event, again not very sensible:

	- a. Brutus stabbed Cæsar with a knife.
	- b. ∃e[Agent(e, Brutus) & Stabbing(e) & Past(e) & Patient(e, Cæsar) & with-a-knife(e)]

<sup>18</sup>Davidson (1967) uses triadic event predicates such as *flying(I, my spaceship, e)* with an "extra argument" for the event variable for transitive verbs. The neo-Davidsonian trend since Castañeda (1967) on the other hand advocates for separation of the arguments from the semantic contribution of the verb and their introduction via predicate conjunction. In this example, I stick to the original Davidsonian formulation.
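The contrast drawn in footnote 18 can be made explicit by rendering (45b) in both formats; the neo-Davidsonian version below is my own illustration, with the Agent and Theme labels for 'I' and 'my spaceship' as expository assumptions:

```latex
% Davidsonian (45b) vs. a separated, neo-Davidsonian rendering (illustrative only).
\begin{align*}
\text{Davidsonian:}\quad & \exists e\, [\mathit{flying}(\text{I}, \text{my spaceship}, e) \wedge \mathit{to}(\text{the morning star}, e) \wedge \mathrm{Past}(e)]\\
\text{neo-Davidsonian:}\quad & \exists e\, [\mathrm{Agent}(e, \text{I}) \wedge \mathrm{Flying}(e) \wedge \mathrm{Theme}(e, \text{my spaceship}) \wedge \mathit{to}(\text{the morning star}, e) \wedge \mathrm{Past}(e)]
\end{align*}
```

In both formats, crucially, the preposition contributes a conjunct relating the event to an individual.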


Therefore, Larson & Segal (1995) propose to view prepositions such as *to* and *with* as expressing roles that can be played by participants in eventualities. For instance, *with* in (46a) expresses the Instrument through which an action is accomplished; therefore, they argue, its logical form representation should be along the lines in (47):

(47) ∃e[Agent(e, Brutus) & Stabbing(e) & Past(e) & Patient(e, Cæsar) & Instrument(e, a-knife)]

This would be the general semantic contribution of adjuncts, which can introduce different roles such as Goals, Sources, Experiencers, etc. We can immediately see that this move paves the way for an explanation of why there are no adpositional *wh*-words cross-linguistically: just as an interrogative verb would create a C–I illegibility, so would an interrogative adposition.

As an illustration, an imaginary example of an interrogative adposition would be along the lines in (48a), with the interrogative preposition *whxyz*, and its corresponding logical form in (48b):

	- b. \* ∃e[Agent(e, Brutus) & Stabbing(e) & Past(e) & Patient(e, Cæsar) & ?(e, a-knife)]

Again, it is difficult to express in plain English what something like (48a) is intended to mean (again, this is precisely my point), but it should be understood as an overarching question about the role and/or the relation and/or the place, etc. of the knife within the stabbing of Cæsar by Brutus. Its ungrammaticality, however, contrasts with the perfect grammaticality of a natural question on an adjunct like (49a), with its corresponding logical form in (49b):

	- a. What did Brutus stab Cæsar with?
	- b. ∃e[Agent(e, Brutus) & Stabbing(e) & Past(e) & Patient(e, Cæsar) & Instrument(e, ?)]

Example (49a) is perfectly grammatical, since it expresses a question over a variable; example (48a), on the other hand, is a question qua predication, and as such it is incongruent.

In a nutshell, then, the hypothesis presented in §3 also allows us to account for the lack of adpositional *wh*-words, and it is extensible to other cross-linguistic lacunæ, like the lack of interrogative tense markers, modalities, etc.


## **6 Conclusions**

In recent years, theoretical (bio-)linguistics has identified a range of different factors affecting the shape of I-languages (see Chomsky 2005; Berwick et al. 2011; Roberts 2012 for discussion). The idea I have proposed in this article is that part of the universal properties of natural languages may be due to legibility conditions imposed by language-external components. I believe that by researching the nature and constraints of such components we can gain further understanding of the limits and patterns of cross-linguistic variability.

# **Abbreviations**


# **Acknowledgements**

Many thanks to the editors (especially to András, who had to deal with the LaTeX typesetting!), to two anonymous reviewers, and to Maia Duguine, Urtzi Etxeberria, Ricardo Etxepare, Nerea Madariaga and the audience at the HiTT Linguistics Seminar (University of the Basque Country UPV/EHU) for their comments. This work benefited from the projects BIM (ANR), UV2 (ANR-DFG), IT769-13 (Eusko Jaurlaritza), and PGC2018-096870-B-I00 and FFI2017-87140-C4-1-P (MINECO). The research leading to these results has also received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement n°613465 (AThEME).

## **References**




# **Chapter 15**

# **Past/passive participles and locality of attachment**

# Alison Biggs

Georgetown University

In this short chapter I outline some properties of the structure "I'm done writing Chapter 3", which does not appear to have been formally analysed before. Concentrating on the *-en*/*-ed* participle and the structure's semantics, I suggest that this is a stative passive of a previously undescribed kind. I offer a syntactic analysis in which an aspectual projection can stativize the eventive syntax it hierarchically embeds.

# **1 Introduction**

English is traditionally described as having three participles of the same form: the stative passive, verbal passive, and perfect (1).

	- b. The letters were written by her.
	- c. She has written a letter.

Establishing points of difference and commonality in the syntax and interpretation of the structures in (1) has played a central role in the development of theories of syntax and word formation.

A particular pattern of interest for statives has been whether the states they describe follow from a prior event. In this, "resulting state" statives (2a), which do follow from a prior event, can be distinguished from "pure" statives (2b), which lack event implications altogether (Parsons 1990; Embick 2004).



It is often observed that in modern English, the participle (potentially) has a resulting state interpretation (as opposed to any other kind of state) only where the past/passive morpheme *-ed/-en* attaches to the item that describes the event from which the state results (e.g. Parsons 1990; Kratzer 2000; Alexiadou & Anagnostopoulou 2008; Alexiadou et al. 2015).

As illustration, in (3a) *-ed/-en* attaches to the main verb, and the surface subject is interpreted as being in a state that results from a (writing) event. In contrast, in (3b) *-ed/-en* attaches to the non-main verb *be*, with the present/active form *-ing* attaching to the main verb, and the structure does not describe a resulting state.

	- b. She has been writing Chapter 3 for days.<sup>1</sup>

The contrast in (3) can be captured by some version of (4):

(4) A resulting state interpretation requires an embedded lexical predicate in past/passive participle form.

The structure in (5) ('*be done* VP-ing') seems to present an exception to this generalization: (5) can describe the object as being in a state resulting from the (writing Chapter 3) event, yet the past/passive morphology attaches to *do*, with the embedded verb in present/active participle form.

(5) She is done writing Chapter 3.

As far as I can tell, the structure in (5) has not been analysed before, and I label it the *done*-state. (5) has some unusual properties: for example, morphosemantically it is a stative passive; syntactically, however, it is transitive and active, in the sense that it licenses a direct object. The key point to be investigated in this paper is that (5) describes a resulting state even though the past/passive affix attaches to the embedding item *do*, apparently violating (4).

The paper is structured as follows. Section 2 makes precise that the "resulting state" interpretation of the *done*-state can be a target state. Section 3 discusses the structure of the *done*-state, highlighting some implications for previous analyses of target state participles. Section 4 discusses and rejects an alternative perfect analysis. Section 5 concludes.

<sup>1</sup>The present/active is often analysed as a state in temporal semantic terms (e.g. Parsons 1990). Temporal semantic states are not usually analysed in the same way as resulting states of the kind of interest here.

15 Past/passive participles and locality of attachment

## **2 The interpretation of the** *done***-state**

"States" form a heterogeneous class (see especially Kratzer 2000). Of interest in this chapter are target states.

Target states describe a temporary or reversible state, e.g. the state held by the surface subject of (6a) (Parsons 1990; Kratzer 2000); such states are interpreted as characteristic of, or resulting from, a prior event. Target states are typically contrasted with resultant states, which simply describe the post-state of an event; this post-state is interpreted as holding forever after the prior event, e.g. the state held by the subject of (6b).


One surprising interpretation of the *done*-state is a target state of the direct object. The target state of the *done* structure in (7a,b) is the state resulting from the event described by the embedded VP. An important point I will not address here is that the stateholder subject of the *done*-state is also interpreted as the agent of the embedded VP.

	- b. She's done writing Chapter 3.

Target and resultant states both describe states that follow a prior event, but differ in the characterisation of that event.<sup>2</sup> Target states describe results of events, and the result is understood as ongoing at the time of reference or evaluation (Kratzer 2000), an effect known as "current relevance". Current relevance can be demonstrated with certain kinds of modifiers, which are licit under the target state interpretation only if they can be construed as modifying the result state. It is said to follow that target states are not possible with adverbs of quantity or cardinality (Mittwoch 2008) (8a,b). (Ungrammaticality here refers to the target state interpretation.)


<sup>2</sup>Target states always entail a resultant state reading, e.g. (6a). As such the *done*-state also has a resultant state reading.

### Alison Biggs

As the resultant state describes the post-state of an event, the event may be over by the time of reference or evaluation, and the state does not require current relevance. Lack of current relevance (despite present tense) is illustrated by the perfect in (6b) and adjectival passive in (9a). (9b) illustrates that quantity/cardinality adverbs can modify resultant states.

	- b. The windows are closed three times each evening.

The target state interpretation is also clearly distinct from a second interpretation of the *done*-state that I call the "cessation" or "termination" reading, in which the surface subject is interpreted as having ceased or terminated engagement in the activity described by the embedded verb. The cessation reading of (10) is simply that Maria is no longer *writing Chapter 3*, i.e. it relates to her agency rather than her (resulting) state. Cessation is therefore clearly different from the target state that results from the embedded VP.<sup>3</sup> For reasons of space I leave to future work whether the cessation interpretation derives from the same structure as that of the target state.

(10) Maria's done writing Chapter 3 for the moment – she has to run more subjects before writing more.

One reason to analyze *done* as a stative participle is that it only occurs with the auxiliary *be*.

(11) \* I've done baking the cake.<sup>4</sup>

This makes *done* unlike aspectual predicates, which can appear with the auxiliary *have*.

<sup>3</sup>Cessation bears a superficial similarity to the *done-with* construction (*I'm done with baking cakes*), a structure which also, to the best of my knowledge, has not been analysed before. Like the *done*-state, *done-with* is morpho-semantically a stative passive; however, *done-with* is syntactically intransitive, while the *done*-state is transitive. The PP in *done-with* presumably has a nominal complement. There are many syntactic and semantic differences between the constructions, but for reasons of space I will point out just one: *done-with* requires an agentive surface subject, while the *done*-state does not: *The water is done (\*with) boiling/The machine is done (\*with) washing that load.*

<sup>4</sup>A reviewer accepts *have* in (11), and highlights that Google returns attested examples. I found: *I've done watching the 6 seasons, I have watched the movie countless time* [sic.], *I've done reading the book.*, retrieved 10/11/2017, http://sachzca.blogspot.co.uk/2008/11/. This is ungrammatical for all speakers I consulted, but, judging from context, the *have* variant does not seem to have a target state reading, so I have left the asterisk in the main text. The observation of variation clearly requires further investigation.


	- b. I've finished/stopped writing Chapter 3.

With stative *be* (13a), on the other hand, *finish* also has the target state interpretation. Some speakers also accept *be finished* VP-*ing* (13b) (although not most of the British speakers I consulted, including myself), apparently again with the target state interpretation.

	- b. % I'm finished baking the cake.

Pending further investigation, I take the auxiliary *be* to be indicative of the structure that derives the target state interpretation.

## **3 The structure of the** *done***-state**

Different stative interpretations (such as the difference between target and resultant states) are known to be built in different ways.

Target states are classically characterised by their having both an event and (target) state argument (Kratzer 2000):

(14) λsλe [ cool(e) ∧ event(e) ∧ cooled(the soup)(s) ∧ cause(s)(e) ] 'The soup is cooled' (Kratzer 2000: 391)

Comparative investigation of the syntax of target state participles has shown that this interpretation derives from a syntactic configuration where a stativizer (labelled Asp) attaches to an eventive component (for example, verbalising *v*, or Root) (Alexiadou & Anagnostopoulou 2008; Embick 2009; Anagnostopoulou & Samioti 2013; Alexiadou et al. 2015).<sup>5</sup>

(15) Target states: *v* attachment of Asp

<sup>5</sup> (15) essentially derives the relevant aspect of the generalization in (4), that the past/passive morpheme attach directly to the lexical predicate: for target states, this can be regarded as a reflex of the local attachment of the aspectual and eventive components in the verbal structure.


Abstracting over (14) and (15), target states have a structure defined by a local relation between an event and a stativizer (Kratzer 2000; Alexiadou & Anagnostopoulou 2008; Embick 2009):

(16) [ event, stative ] → target state interpretation

At first blush, the *done*-state seems to present an exception to (16), given that the stative *(be) done* clearly embeds the eventive VP. That *done* has the stativizing aspectual function is confirmed by the pair in (17), which shows that while the present/active is aspectually unbounded or ongoing (17a), the structure with *done* has a result state (17b).

(17) a. I'm writing Chapter 3.

b. I'm done writing Chapter 3.

However, closer analysis of the *done*-state structure indicates that the generalisation in (16) can be retained.

I propose that the stativizer (*-en*) attaches to a semantically vacuous *v*, and that it is this local attachment that derives the target state, in line with (15) and (16).

(18) [ *v*-stative [ event ] ] → target state interpretation in the *done*-state

This *v* is realized as *do*. As such, *do* is a dummy item, a form of *do*-insertion that supports the aspectual morpheme. Dummy *do* can similarly appear in the participial form *done* (rather than *do*, *did*, *does*, etc.) in the British varieties of English that allow *do* to appear at the edge of a VP-ellipsis site following a modal or auxiliary (19); thanks to Dave Embick (p.c.) for this point.

	- b. Have you looked up the scores yet? I haven't done, but will do.

The intuition is then that, because the eventive item that the stativizer attaches to is semantically vacuous, the event that the vacuous *v* describes is anaphoric with that described by the embedded VP. This vacuity means that the *done*-state describes only one prior eventuality, not two: (20) says that there was only a *cutting* event, for example.

(20) I'm done cutting his hair.
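Informally, the intuition behind (18) might be rendered in the format of (14); this is only a sketch, abstracting away from the compositional details, with the crucial point being that the vacuity of *v* forces identification of its event argument with that of the embedded VP:

λsλe [ *v*(e) ∧ event(e) ∧ cut(his hair)(s) ∧ cause(s)(e) ], where *v* is semantically vacuous, so that e is identified with the cutting event described by the embedded VP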


For reasons of space I cannot address whether participial forms of *do* are eventive when they lack the VP complement (i.e., *She's done*); for observations that they may not be (at least syntactically), see Fruehwald & Myler (2015) in connection with the dialectal form *I'm done my homework*.

Although on this account the target state itself is created by the local event–state relation, the non-local relation between the stativizer and the VP event makes a prediction with respect to possible target state interpretations. It has often been observed that local attachment in (16) restricts sets of possible interpretations in a way that non-local attachment does not (in the context of participles, see especially Anagnostopoulou & Samioti (2013), and references there). In particular, under local attachment of Asp to the eventive component, root meaning interacts with Asp, so that a Root that does not typically describe a property of states does not easily appear in the target state structure without significant context or coercion; Kratzer (2000) and Embick (2009) give a range of examples of this of the type in (21). Embick (2004) suggests the target state reading of *kicked* can be coerced with a factory scenario where all of the tyres have to be kicked before employees can leave; a similar factory scenario can improve a target state interpretation of hammered nails.

	- b. ? These nails are hammered.

As the relation between the target state component and the (lexical) event in (18) is non-local, Asp and eventive *v* (or Root) should not exhibit such restrictions, and *done* should create a target state even with verbs that do not easily form target state interpretations via direct attachment. This prediction is borne out. A target state reading is readily available with *kick* and *hammer* under *done* (22), even in out-of-the-blue contexts.

	- b. I'm done hammering the nails.

In sum, given the findings of the previous Section, I propose the structure of the *done*-state is as in (23).


(23) The *done*-state structure: *I am done writing Chapter 3.*

The auxiliary *be* is in T, and T takes a stativizing projection, AspP<sub>1</sub>, as its complement. This "top part" of the structure lacks an argument-introducing projection, such as Voice; it is, in effect, a stative passive.

A second aspectual projection is realised as the present/active morphological form. In the lower component of (23), an active VoiceP has a transitive syntax, introducing an argument in its specifier, and valuing Case on an internal argument. It is the argument in the specifier of Voice that is the agent of the embedded event; this argument is proposed to be PRO. The surface subject is then interpreted as both the agent and state holder of the clause via a control relation.

The remainder of this chapter briefly discusses a possible alternative analysis of the *done*-state.

## **4 Against a perfect syntax**

An alternative analysis of the *done*-state might draw a comparison with the English perfect.

*Be*-perfects are found in Bulgarian, for example, where a (resultative) perfect can be expressed with the perfective participle:


(24) Bulgarian
	Ivan e postroil pjasâčna kula.
	Ivan be-3sg.prs build-prf.m.sg sand castle
	'Ivan has been building a sandcastle.' (Pancheva 2003: 296)

Perhaps, then, the *done*-state has a syntactic structure like (25), with *done* a marker of perfectivity.

(25) A present perfect: *I am done writing Chapter 3.* (To be rejected)

Syntactically, though, the *done*-state has different properties from the English *have*-perfect. Building on tests discussed in Fruehwald & Myler (2015): first, the *done*-state is grammatical with *all*-modification (26a), just like other stative passives (26b), while the perfect is ungrammatical with *all*-modification (26c).

	- b. I'm all ready.
	- c. \* I've all washed the dishes.

Second, the *done*-state can appear in a reduced relative clause, while the perfect of a transitive cannot.


	- b. \* Would all the students signed the petition please leave?
	- c. Would all the students who have signed the petition please leave?

I do not pursue a perfect analysis further; see Fruehwald & Myler (2015) for a similar conclusion for other forms of *be done* based on study of the dialectal *done my homework* construction.<sup>6</sup>

## **5 Summary**

An extensive body of work has shown that a target state interpretation derives from a structure in which an eventive and stative component are in a local syntactic relationship. This paper investigated an apparent counter-example to this analysis. It showed that statives of the form *I'm done VP-ing* (*She's done writing Chapter 3*) have a target state interpretation. However, in this structure the stativizing past/passive morpheme attaches to *do*, so that it is in a non-local configuration with the event described by an embedded verb phrase, the event from which the target state is interpreted as resulting.

I argued that the target state interpretation of the *done*-state is nonetheless derived via a local relation between a state and an eventive component, as in previous work. However, in the *done*-state, the eventive component that the stativizer attaches to is semantically vacuous, so that the prior event from which the target state follows is understood to be that of the embedded VP. The non-local relation between the stativizer and the eventive VP component permits regular derivation of target state interpretations even from events that do not typically yield target states. Further research is needed to address the general challenge of determining how the target state of the event is accessed by the stativizer, whether in a local or non-local configuration.

<sup>6</sup>The *I'm done my homework* construction (DMH), found in Philadelphia, Canada, and Scotland, can also be syntactically and semantically distinguished from *done*-state structures. Fruehwald & Myler (2015) show at length that the state described by DMH does not come about as a result of a semantically or syntactically identifiable prior event (Fruehwald & Myler 2015: 154–157) (thanks too to Meredith Tamminga and David Wilson for discussion). As such, Fruehwald & Myler (2015) analyse the DMH structure as a complex aP *done* (which does not have a VP component), an aP that Case-licenses an NP complement in a "transitive adjectival passive" configuration.

Despite the syntactic and semantic differences between DMH and *done*-VP-ing, Fruehwald & Myler (2015) make the intriguing observation that the availability of DMH across varieties of English correlates with also having the form *X-en*-VP-ing. Some (Montreal) speakers, for example, have DMH with *start* (*I'm started NP*), and this seems to correlate with also having *I'm started VP-ing*, ungrammatical in most varieties of UK and US English. I leave examination of possible structural parallels between the two constructions to future work.


## **Abbreviations**


# **Acknowledgements**

Thanks to Ian Roberts for many conversations about passive structures. The work discussed in this chapter is an offshoot of a collaborative project on the *I'm done my homework* construction with Meredith Tamminga; particular thanks are due to her and to Dave Embick for very helpful discussion. Thanks to reviewers whose suggestions greatly improved exposition. Any errors are mine.

## **References**




# **Chapter 16**

# **Functional items, lexical information, and telicity: A parameter hierarchy-based approach to the telicity parameter**

# Xuhui Hu

Peking University

This paper presents an account of the parameters of telicity based on data from Yixing Chinese, a Wu dialect of Chinese, as well as from well-studied languages such as English and the Slavic languages. It is argued that cross-linguistic variation in telicity reduces to two factors in the lexicon: whether a language has a functional item bearing a telic (quantity) feature, and whether that telic functional item also bears extra semantic information entailing the measuring-up point of the event. These two factors determine the following properties: in English and other Germanic languages, which lack a telic functional item, telicity often relies on quantity objects, and a quantity object often forces a telic interpretation; in Slavic languages and Chinese (Mandarin, Yixing, and perhaps other dialects), where telic functional items are available, telicity relies not on quantity objects but on the functional item, which imposes quantification over bare nominals. Slavic languages differ from Chinese in that their telic items also bear semantic information entailing the measuring-up point of an event, so the endpoint of a telic event is invariably identified with its measuring-up point, while such information is only a piece of cancellable default meaning in Chinese.

# **1 Introduction**

This paper studies the syntactic variation of inner aspect, which is concerned with the internal temporal structure of an event, as opposed to outer aspect

Xuhui Hu. 2020. Functional items, lexical information, and telicity: A parameter hierarchy-based approach to the telicity parameter. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 329–355. Berlin: Language Science Press. DOI: 10.5281/ zenodo.3972864


(Travis 1991: 7), which denotes the speaker's point of view on the event (Smith 1997). While outer aspect is uniformly taken to be a syntactic object realised by a functional (Asp) head in the syntactic tree of the Chomskyan tradition, ever since Vendler's (1957) work inner aspect has been widely taken to be part of lexical information, characterised by the classification into accomplishment, achievement, activity, and state predicates. Recently, however, researchers such as Borer (2005a,b; 2013), MacDonald (2008), and Travis (2010) have come to the conclusion that inner aspect, like outer aspect, is an interpretation derived from syntactic computation rather than a piece of lexical information. In this paper, drawing on new data from a Chinese dialect, Yixing, I take up this assumption, especially that of Borer (2005a,b; 2013), to further investigate the underlying mechanism behind the cross-linguistic variation of inner aspect.

Variation concerning inner aspect, especially telicity, has been widely discussed by Filip (1997; 2000), Filip & Rothstein (2000), Borer (2005a,b), MacDonald (2008), and Travis (2010), among many others. This paper, drawing upon data from Yixing Chinese described in Hu (2016), places Chinese within the broad picture of the comparative study of telicity, and shows that variation in telicity hinges upon the (un)availability in the lexicon of a functional item bearing a telic (quantity) feature, with further variation arising from extra semantic flavours of that functional item. This paper therefore not only contributes new data to the debate on the nature of telicity, but also provides a new account of the variation of telicity in terms of the hierarchy of parameters proposed in Roberts (2010).

The rest of this paper is organised as follows. §2 presents a brief introduction to telicity-related notions and issues in English and Slavic languages that have been covered in recent studies of telicity and its variation. §3 summarises two approaches to the cross-linguistic variation of telicity, and §4, presenting the data from Yixing Chinese, brings Chinese into the picture of telicity variation. Based on the data and framework outlined in the previous sections, in §5 I explain the underlying mechanisms that govern variation in telicity and work out a hierarchy of parameters of telicity. §6 concludes the paper.

## **2 Inner aspect and variation: The facts**

### **2.1 Inner aspect: A short introduction**

Inner aspect, also termed *aktionsart* or *lexical aspect*, is not about how the language user views an event, but about the internal structure – the temporal structure in particular – of an event. Whether an event is expressed as having an endpoint is at the centre: an event with an endpoint is telic, otherwise atelic. It should be noted that telicity is not about reality, but about information expressed linguistically:

	- a. John ate an apple in 10 minutes.
	- b. John ate apples for 10 minutes.

(1a) is telic: the endpoint of the eating event was the point when the last bit of the apple was consumed. (1b) is atelic, as the endpoint is not expressed linguistically – we only know from the sentence that within 10 minutes, John had been eating apples. While in reality there will be an endpoint of the event of John's eating apples, this information is not expressed by the sentence.

If we take only English data to explore the nature of inner aspect, two factors are at stake in determining telicity. The first concerns verb type in terms of Vendler's (1957) classification. Telicity in English often goes with achievement and accomplishment predicates, and it is quite hard to express a telic event if the predicate is of the activity or state type (though, as shown below, under certain circumstances telicity can also arise with such predicates).

	- b. John drank a bottle of beer in 10 minutes.
	- c. John pushed the cart for/\*in 10 minutes.
	- d. Mary stayed in London for/\*in 10 days.

In the above examples, *reach* is an achievement predicate which denotes a change of state at a single temporal point – the initial and the final point coincide, i.e. at *9 pm* in (2a). *Drink* is an accomplishment predicate which in (2b) denotes an event that spans a period of time: the initial point was when John began to drink the beer, while the endpoint was when the final drop of the beer was consumed. In (2c) and (2d), no endpoint is expressed, which is confirmed by the incompatibility with the *in x time* adverbial, a standard diagnostic of telicity. Following Dowty (1991) and Rothstein (2004), the predicates that allow for telicity take the internal argument as an "incremental theme", an argument that measures up the event, representing a homomorphic mapping between the argument and the event. For example, *a bottle of beer* is an incremental theme in (2b): there is a one-to-one homomorphic mapping between the bottle of beer and the drinking event: the consumption of the last drop of the beer signals the endpoint of the drinking event. It is in this sense that predicates like *drink* are termed homomorphic predicates (Krifka 1992; 1998; Filip 1997), a class which includes both accomplishment and achievement predicates.
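The homomorphic mapping can be sketched in a standard Krifka-style formulation (a simplified rendering for illustration, not a definition from this paper): where θ maps (sub)events onto their incremental themes,

∀e′ [ e′ ⊑ e → θ(e′) ⊑ θ(e) ]

Every part of the drinking event in (2b) is mapped onto a part of the bottle of beer, so the event is "measured out" by the object: the event culminates exactly when the whole object has been consumed.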

The second factor concerns the internal argument. An accomplishment or achievement predicate does not guarantee the telicity of an event: often a quantised or quantity object (Krifka 1992; 1998; Borer 2005a,b) is needed. Consider the following examples:

	- a. John built a house in 2 months.
	- b. John built houses \*in 2 months.
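The notion of a quantised (quantity) object invoked here has a standard definition (Krifka 1998); the following is a simplified rendering for illustration only:

QUA(P) ↔ ∀x∀y [ P(x) ∧ P(y) → ¬(y ⊏ x) ]

A predicate is quantised if no proper part of something in its denotation is itself in its denotation: *a house* is quantised (no proper part of a house is a house), while bare *houses* is not, and only the quantised object supports the telic reading.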

In addition to the aforementioned factors, sometimes a directional PP can also affect telicity: while an activity predicate normally does not allow for telic interpretation, the addition of a directional PP can contribute to the telic interpretation:

	- b. John pushed the cart *to the wall* in 10 minutes.

It is clear that it is the PP *to the wall* that makes the telic interpretation legitimate. Without taking any theoretical stance for now, we can say that the function of the PP is to provide an endpoint for the pushing event.

From the English data we can conclude that telicity involves multiple levels: the predicate type (the level of the verbal head, or simply a matter of lexical information), the quantity of the nominal object (the NP or DP level, definitely not a matter of lexical information), and the function of the directional PP (the VP level). Whatever approach we take, one point is certain: telicity is by no means a matter solely confined to the domain of lexical information.

Another conclusion from the English data is that telicity (or inner aspect) in English, unlike outer aspect, is not morphologically marked: there is no grammatical marker that yields telicity. Outer aspect, by contrast, clearly involves morphological marking – the progressive aspect is reflected by the *-ing* marking on the verb, for example.

In the next section, I will show that in some languages the telicity of an event is not determined by these factors in the same way; moreover, such languages have morphological marking directly related to telicity. The existence of such phenomena makes the variation of telicity an interesting research topic, and it constitutes the central topic of this paper.


### **2.2 Telicity in Slavic languages**

Slavic languages are often taken as the major source of evidence for the variation of telicity (but see Travis 2010 for more languages). Two points about Slavic languages are at stake. First, when a telic event is expressed, a perfective prefix is attached to the verb. The following Russian examples exhibit this point:

	- a. Ja vypil butylku vina za čas / \*v tečeniji časa.
	  I drank-pfv a-bottle of-wine in hour / during hour
	  'I drank a bottle of wine in an hour / \*for an hour.'
	- b. Mary pročitala knigu za čas / \*v tečeniji časa.
	  Mary read-pfv a-book in hour / during hour
	  'Mary read a book in an hour / \*for an hour.'

Recall in English, the existence of an accomplishment predicate and a quantity object can give rise to a telic event; in Russian, however, without a perfective prefix, telicity cannot be yielded:

	- a. Ja pil butylku vina \*za čas / v tečeniji časa.
	  I drank-ipfv a-bottle of-wine in hour / during hour
	  'I drank a bottle of wine \*in an hour / for an hour.'
	- b. Mary čitala knigu \*za čas / v tečeniji časa.
	  Mary read-ipfv a-book in hour / during hour
	  'Mary read a book \*in an hour / for an hour.'

While a directional PP can turn an activity event into a telic one, without a perfective affix, telic interpretation is just impossible in Russian:

	- a. Fermer tasčil brevno v ambar \*za čas / v tečeniji časa.
	  the-farmer dragged-ipfv the-log into the-barn in hour / during hour
	  'The farmer dragged the log into the barn \*in an hour / for an hour.'
	- b. Ptisi leteli k kletke \*za čas / v tečeniji časa.
	  the-birds flew-ipfv toward their-cage in hour / during hour
	  'The birds flew toward their cage \*in an hour / for an hour.'


While there are further telicity-related properties of Slavic languages which I will introduce in the course of the discussion, we can already see some dimensions of the variation of telicity: in Slavic languages, telicity is morphologically realised, unlike in English, which relies only on the quantity theme and the predicate type (and sometimes directional PPs). This variation provides clues as to the nature of telicity, and raises specific issues for the investigation of the mechanism underlying the variation of telicity.

## **3 Approaches to variation of telicity**

### **3.1 The lexicalist approach**

Abstracting away from technical details, two strands of analysis of the variation of telicity have been proposed, one lexicalist (Filip 2005; Filip & Rothstein 2000) and the other syntactic (Borer 2005a,b; MacDonald 2008; Travis 2010). In this section, I present a brief summary of the lexicalist approach, which is characterised by the central assumption that the telic reading is derived not via the valuation of a feature specified on a functional head, but from the lexical information of the predicate.

According to Filip (2005) and Filip & Rothstein (2000), telicity arises because a maximalisation operator Max<sub>E</sub> applies to the denotation of the predicate of an event. This operator maps sets of events denoted by the predicate onto sets of maximal, i.e. telic, events. Take English for example. An accomplishment verb like *eat* denotes a set of events the stages of which are qualitatively the same. For a maximal eating event to be obtained when Max<sub>E</sub> applies, an externally given scale is needed relative to which an event is maximal. In the case of *eating*, the referent of the internal argument (such as *an apple* in *eat an apple*) serves as the external scale: when this scale is taken into consideration, the stages of the eating event differ, and Max<sub>E</sub> picks out the maximal event of eating the whole apple.
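Setting aside the role of the external scale, the core effect of the operator can be sketched as follows (a simplified rendering for illustration only, not Filip & Rothstein's own formulation):

Max<sub>E</sub>(P) = { e ∈ P | ¬∃e′ [ P(e′) ∧ e ⊏ e′ ] }

Max<sub>E</sub> retains only those events in the denotation of the predicate that are not proper parts of larger events in that denotation; for *eat an apple*, only the event in which the whole apple is consumed survives.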

Based on the maximalisation operator Max<sub>E</sub>, Filip & Rothstein (2000) further argue that cross-linguistic variation in telicity arises because Max<sub>E</sub> applies at different levels across languages. In particular, in Germanic languages Max<sub>E</sub> applies at the VP level, which means that the composition of the semantics of the verb and the object plays a central role in determining telicity, as in the case of *eat three apples*. Any information in the VP domain can be taken as a resource for Max<sub>E</sub> to apply. The direct object plays a role because it can legitimately be taken as the external scale. The lexical meaning of the verb also plays a role because in most cases only incremental verbs can denote an event that can take the internal argument as its external scale. That is why *eat an apple* can be taken to denote a telic event, while *carry an apple* denotes an atelic event. In Slavic languages, the Max<sub>E</sub> operator applies at the verbal level, so only lexical information at the verbal level can feed the application of the Max<sub>E</sub> operator. The perfective prefix is taken as a derivational affix, changing the lexical meaning of the verb. In particular, the prefix has a measure function, enabling an otherwise non-atomic predicate to denote a maximal event. Therefore, the Max<sub>E</sub> operator can apply without resorting to lexical information beyond V, such as the object, because a maximal event is already denoted by the verbal predicate. Filip & Rothstein (2000) did not make explicit the grammatical nature of this operator, but on their explicit proposal its application is a purely semantic operation, so there is no syntactic node corresponding to the operator. That is why this account is referred to here as a lexicalist approach: the Max<sub>E</sub> operator serves the function of changing the denotation of the predicate (V or VP).<sup>1</sup>

Filip & Rothstein (2000) thus capture the surface differences in telicity between Germanic and Slavic languages. The major problem is as follows: this approach relies on the different domains of quantification imposed by a null Max<sub>E</sub> operator, and we may further ask what determines this domain (V or VP). Or, to put it another way, why does this operator apply selectively across languages? Moreover, this assumption is not in line with the recent Minimalist view of linguistic variation, especially the Borer–Chomsky conjecture (cf. Baker 2008; Roberts & Holmberg 2010), which reduces variation to feature-related factors in the lexicon.

### **3.2 The syntactic approach**

The syntactic approach, taken by researchers like Borer (2005a,b), MacDonald (2008), and Travis (2010), assumes that telicity or inner aspect is encoded in the syntax. Here I will concentrate on Borer's (2005a,b) exo-skeletal (XS) account of telicity, which will also serve as the theoretical framework for the issues explored in this paper.

Like other research by Bach (1986) and Rothstein (2004), the XS model captures the semantic parallelism between the domain of events and that of objects. The

<sup>1</sup>An anonymous reviewer suggests that the Max<sup>E</sup> operator might also be a syntactic feature. I agree with this possibility, although it is not really the proposal in the original account, which takes the application of Max<sup>E</sup> as a purely semantic operation. In addition, if Max<sup>E</sup> is a syntactic feature, it should be specified on the same functional head, and it would then be difficult to explain why this feature applies to V and VP respectively in different languages.

### Xuhui Hu

XS model takes a step further by specifying two parallel functional structures encoding events and nominals. The functional structures encoding events and objects, which are EP (event phrase) and DP (determiner phrase) respectively, both involve a quantity head and a deictic head (E in EP and D in DP) that anchors the entity (either an event or an object). In an extended projection, i.e. functional structure, it is assumed that each functional head specifies an open value, which has to be assigned range so that the semantic function can be available for the interpretation of the structure. Range assignment can be either direct or indirect. The direct range assignment involves the merging of a functional item to the corresponding functional head. A functional item can be an independent morpheme termed "f-morph". *Will* in English is such an f-morph which assigns range to the open value specified on the T head. A functional item can also take the form of a bound morpheme termed "head feature", such as the English past tense affix *-ed*. The indirect range assignment can be instantiated by an adverb of quantification, a discourse operator,<sup>2</sup> and specifier–head agreement (Borer 2005b: 18). Range assignment via specifier–head agreement means that the open value specified on a functional head can be assigned a range if the phrase in the specifier position contains this range.

Borer (2005a,b; 2013) postulates that the underlying reason for linguistic variation is tied to how an open value is assigned range. For example, variation might arise from whether the range is assigned in the shape of a bound morpheme or a functional item, or whether the range assignment is achieved directly or indirectly. While there are various definitions of interpretable and uninterpretable features (cf. Pesetsky & Torrego 2004), in general the pair of open value and range is the equivalent to the pair of uninterpretable and interpretable features. Therefore, for the ease of exposition, in the rest of this paper, I will use the terms of uninterpretable and interpretable features.

Following the Davidsonian approach (Davidson 1967; 1980; Parsons 1990), Borer (2005a) argues that the functional structure EP is responsible for the derivation of the interpretation of events, including that of the event participants as well as the temporal situation of the event, i.e. inner aspect. The extended projection, EP, starts from a lexical item, often a verb, which is dominated by several functional heads in a fixed and universal hierarchical structure, represented as follows:

<sup>2</sup>The accurate mechanism of the range assignment by an adverb of quantification or a discourse operator is not elaborated on in Borer's system. This type of range assignment is not directly relevant to our account, and thus I will not explore it further.

16 Functional items, lexical information, and telicity

While according to the lexicalist approach to argument structure (cf. Chomsky 1970; Reinhart 2003), the roles of event participants are projected by the predicate which is embedded at the bottom of the functional structure, in the XS model, the predicate does not contain any syntactic information such as the thematic grid. The interpretation concerning the theta roles of event participants and telicity is derived from the functional structure EP, and the predicate only provides conceptual meaning that modifies the functional structure. The Asp<sup>Q</sup> head in EP is the counterpart of the quantity head in DP, responsible for the quantification of the event, and the valuation of the feature specified on this head is the source of telic interpretation. Thus, in the XS model, telicity comes from the valuation of the quantity feature specified on the Asp<sup>Q</sup> head. In languages like English, the valuation of the quantity feature is often achieved via specifier–head agreement, which can copy the quantity value of a quantity DP in the specifier position of the AspQP onto the Asp<sup>Q</sup> head, thereby giving rise to the interpretation of telicity.

We can take the following examples to illustrate the feature valuation of quantity in EP:
(9) John ate three apples.

(10) John ate apples.


Following the XS model, in (9) it is the DP *three apples* in the specifier of the AspQP that provides the interpretable quantity feature to value the uninterpretable quantity feature on the Asp<sup>Q</sup> head. The valuation of the quantity feature then gives rise to the semantic interpretation of the telicity of the eating event. In (10), on the other hand, the bare plural *apples* does not bear an interpretable quantity feature, which means that even if the Asp<sup>Q</sup> head projects, the valuation of the quantity feature, and hence a telic interpretation, cannot be achieved.

Just as in the DP structure, the functional head specifying the quantity feature in EP is optional, and its absence is exactly what characterises atelic events. When the Asp<sup>Q</sup> head


does not project, a layer of F*s*P will appear in the position otherwise occupied by AspQP, and the [Spec F*s*P] position will host a DP that is the theme of the event.<sup>3</sup> Since this paper focuses on telicity, F*s*P will not be discussed further.

As mentioned above, in English the quantity feature on the Asp<sup>Q</sup> head is valued via the indirect strategy: copying the quantity feature of a DP in [Spec AspQP] via agreement. In theory, the direct valuation strategy might be available in some languages, if there is a functional item in the lexicon that bears an interpretable quantity feature. Borer (2005b) shows that this situation does exist in some Slavic languages. In languages like Czech, a perfective prefix serves as an event delimiter, which, on the one hand, imposes a telic interpretation and, on the other, restricts the interpretation of bare nominal arguments by providing them with quantificational force:

(11) Czech (Filip 1997: 62)

a. Pil*<sup>I</sup>* víno.
   drank-sg wine-sg-acc
   'He was drinking (the) wine.'

b. Vypil*<sup>P</sup>* víno.
   pfv-drank-sg wine-sg-acc
   'He drank up (all) the wine.'

In the above example, the prefixed perfective verb gives rise to a telic interpretation. In addition, the prefix also forces a definite and quantity reading on the bare object, as is shown in (11b). Without the perfective prefix, no telic reading is attested, and the bare noun need not take a definite or quantity reading, as shown in (11a).

Borer (2005b) takes such data as evidence of the paradigm of direct range assignment (feature valuation). In particular, a perfective prefix in Slavic languages is a functional item that bears the interpretable quantity feature, which is directly merged in the Asp<sup>Q</sup> head to value the uninterpretable quantity feature ([*u*Quan] for short). In addition, when a bare nominal theme argument is involved, the perfective prefix copies the quantity feature to the quantity head in the DP structure, and provides a strong D feature to value the uninterpretable D feature ([*u*D]) on the D head of the DP, as is the case in (11b).

<sup>3</sup> In Borer's (2005b) original model, the nominal in the [Spec AspQP] position takes the role of "subject of quantity", while the nominal in the [Spec F*<sup>s</sup>*P] position takes the role of "default participant". Abstracting away from technical details that do not concern the discussion in this paper, and for ease of exposition, I will simply use the term "theme" to refer to the DPs in the [Spec AspQP] and [Spec F*<sup>s</sup>*P] positions.


## **4 Telicity in Yixing Chinese**

Chinese has not been considered in previous studies on the cross-linguistic variation of telicity, mostly because it is not well understood exactly how telicity is derived in Chinese.<sup>4</sup> The inner aspect of Chinese is widely mentioned in the vast literature on the famous verbal particle *le*, which is often assumed to be related to telicity in one way or another (cf. Smith 1997; Lin 2003; Soh & Gao 2007; Soh 2009; 2014). However, the mechanism of telicity in Chinese is by no means clear, partly because the verbal *le* does not always give rise to telicity:

### (12) Mandarin


Obviously, without knowing the precise factor determining telicity in Chinese, it is impossible to bring Chinese into the broad picture of the telicity parameter. In Hu (2016), I used data from Yixing Chinese<sup>5</sup> to show that the verbal *le* in Mandarin is not a homogeneous category, but is the phonological realisation of two homonymous categories, one being an inner aspectual marker, the counterpart of *lə* in Yixing, and the other an outer aspectual marker, corresponding to *dzə* in Yixing. In this section, I present the major properties of *lə* that are closely related to the central topic of this paper, i.e. the variation of telicity, while leaving other properties aside (but see Hu 2016). To distinguish *lə* from *dzə*, the properties of *dzə* will also be presented when necessary.

In Yixing, all the achievement and accomplishment predicates with an incremental theme can occur in a *lə*-marked sentence. In addition, when *lə* occurs, a telic interpretation arises invariably, as is evidenced by the compatibility with

<sup>4</sup> In the final stage of proofreading this paper, I was informed that Peng (2017) discovered that in Chinese dialects like Pingxiang, there are also two distinct particles that both correspond to the verbal *le* in Mandarin. Peng also shows that one particle is a telic marker, which further supports the analysis made here. I would like to emphasise that I am by no means the first to correlate telic function with verbal *le* in Mandarin. The crucial point made in this paper is how parameters of telicity could be derived with such linguistic phenomena.

<sup>5</sup>Yixing Chinese is a variety of Wu dialect, spoken in Yixing county with a population of 1,243,700, a subdivision of Wuxi city in China's Jiangsu province.


the Chinese version of the *in x time* phrase. That is, a *lə*-marked sentence always denotes an atomic event in the sense of Rothstein (2004).


There is evidence that telicity is not information contributed by the predicate in Yixing. Instead, telicity is directly related to *lə*: when *lə* is not available, a telic interpretation cannot be attested even with a quantised incremental theme and a homomorphic predicate. For example, the examples in (15) will be unacceptable if *lə* is replaced by another verbal particle or if there is no particle at all:

(15) Yixing Chinese


The examples in (13) and (14) on the one hand, and (15) on the other, form minimal pairs, clearly indicating that what plays a crucial role in yielding the telic interpretation is the particle *lə*.

What further supports the above descriptive conclusion is that *lə* may also force an event with an activity or state predicate to yield a telic interpretation, although such predicates usually appear in atelic events in Vendler's classification.

<sup>6</sup>∅ stands for zero-particle, i.e., the situation when no particle occurs.

<sup>7</sup>The verbal particle *go*, which is the counterpart of *guo* in Mandarin, often indicates that an event happened before a certain time but does not have an effect on the topic time (cf. Smith 1997; Soh 2014).


### (16) Yixing Chinese
a. ʣaŋsa sasə fəŋʣoŋ lidou tae lə sa ʦɔ ho.
   Zhangsan thirty minutes in push lə three cart good
   'Zhangsan pushed three carts of goods in 30 minutes.'

b. ʣaŋsa ʤiŋʣao jɨ te lidou kaeʃiŋ lə sa ʦi.
   Zhangsan today one day in happy lə three time
   'Today, Zhangsan became happy three times in one day.'


The events denoted by the above two sentences are telic, as evidenced by the adverbial *in x time*. Although these two sentences involve an activity predicate and a stative predicate respectively, which in most cases appear in atelic sentences, the telic interpretation is obligatory because of the presence of *lə*. What is especially noteworthy is that although *kaeʃiŋ* 'happy' is often used as an adjective, it takes a dynamic reading here, denoting a change of state roughly equivalent to *to become happy* in English. As illustrated below, without *lə* the above sentences are unacceptable:

(17) Yixing Chinese

a. \* ʣaŋsa sasə fəŋʣoŋ lidou tae ∅ / ʣə / go sa ʦɔ ho.
   Zhangsan thirty minutes in push ∅ / ʣə / go three cart good
   intended: 'Zhangsan pushed three carts of goods in 30 minutes.'<sup>8</sup>

b. \* ʣaŋsa ʤiŋʣao jɨ te lidou kaeʃiŋ ∅ / ʣə / go sa ʦi.
   Zhangsan today one day in happy ∅ / ʣə / go three time
   'Today, Zhangsan became happy three times in one day.'

Previous studies on the verbal *le* in Mandarin mainly focus on the semantic effects relevant to the event, such as whether it denotes the completion or termination of an event and whether it signals the realisation of a state that holds at the topic time. The possible relationship between the verbal *le* and the nominal theme has never been considered. With the Yixing data, the quantificational effect of *lə* over the nominal theme of the event is brought to our attention. In Yixing, *lə* occurs in a sentence where the nominal theme has a quantity reading. Whenever a non-quantity reading is imposed on the nominal theme, the sentence will be

<sup>8</sup>The context of this sentence is exactly that of (16a).


unacceptable. As I will show shortly, bare nouns can be the theme of the verbs marked with *lə*; when this occurs, the bare noun will not have the mass reading or bare plural reading,<sup>9</sup> but will be forced to take a specific and quantity reading. Therefore, the requirement of the quantity theme can be met in two situations. Firstly, *lə* can co-occur with a nominal that involves an overt numeral, clearly indicating quantity:

(18) Yixing Chinese

tɔ ʧε *lə* sa ʣə bɪŋgo.
he eat lə three clf apple
'He ate three apples.'

If a bare nominal occurs as the theme with a mass or bare plural interpretation, the sentence will be ungrammatical:

(19) Yixing Chinese


A bare nominal theme, if it is to be compatible with *lə*, must have a quantity and definite specific reading. This quantity/definite reading is possible when the bare nominal is fronted in a topic construction. Three positions are possible if the object is taken as the topic in Yixing (and in Mandarin): the clause-initial position, the position between the subject and the verb (SOV), and the complement position of *ba*, <sup>10</sup> which is *nɔ* in Yixing. All three positions can host the bare nominal object when it co-occurs with *lə*. <sup>11</sup> In the following examples, the bare noun *ʤu* 'alcohol' has a quantity and specific interpretation: for such examples to be grammatical, it has to mean a certain quantity of alcohol, as well as the

<sup>9</sup>Like Mandarin, Yixing does not have a plural marker in general, so nominals with mass and bare plural readings both have the form of bare nominals.

<sup>10</sup>In Mandarin, *ba* occurs after the subject and takes the object in its complement position, where the object is often interpreted as the topic. Its counterpart in Yixing is *nɔ*, which works exactly like *ba*. For a comprehensive description and analysis of the *ba*-construction, see Huang et al. (2009: 153–196).

<sup>11</sup>This description ignores possible underlying structural differences, which aren't crucial here.


presupposition that this quantity of alcohol is known to both the hearer and the speaker.<sup>12</sup>

(20) Yixing Chinese

ʤu ŋo jiʤiŋ ʧε *lə* lɨ.
alcohol I already eat lə lɨ

'I have drunk the alcohol (i.e. the certain amount of alcohol has been drunk up by me).'

(21) Yixing Chinese

ŋo ʤu jiʤiŋ ʧε *lə* lɨ.
I alcohol already eat lə lɨ

'I have drunk the alcohol (i.e. the certain amount of alcohol has been drunk up by me).'

(22) Yixing Chinese

ŋo jiʤiŋ nɔ ʤu ʧε *lə* lɨ.
I already nɔ alcohol eat lə lɨ

'I have drunk the alcohol (i.e. the certain amount of alcohol has been drunk up by me).'

We can thus draw a descriptive conclusion: a quantity theme is an obligatory requirement of *lə*. This requirement is met when a nominal phrase already takes a quantity feature provided by a numeral; if a numeral is not available, as in the case of bare nominals, *lə* seems to "offer" a quantity interpretation. In Hu (2016), following Borer (2005b), it is argued that this is made possible because the quantity feature of the telic item *lə* is copied onto the nominal in the [Spec AspQP] position, thus presenting a symmetry with the situation in English: in English, without a functional telic item, the quantity feature of a DP in [Spec AspQP] has to be copied onto the Asp<sup>Q</sup> head, while in Chinese, with the feature provided by the telic item, the quantity feature on the Asp<sup>Q</sup> head is copied onto the nominal phrase in [Spec AspQP].

# **5 Exploring the telicity parameter**

### **5.1 An initial account**

So far, with Chinese, English, and Slavic data, it seems that telic variation can be neatly accounted for with Borer's XS model. All the cross-linguistic issues

<sup>12</sup>For an account of fronting the object in these examples, see Hu (2016).


can be reduced to a single factor: whether there is a functional item in the lexicon specifying a quantity feature,<sup>13</sup> which can be directly merged in the Asp<sup>Q</sup> head to value the uninterpretable quantity feature on this functional head (the 'Asp quantity feature', to be distinguished from the quantity feature of DP).

The parameter of telicity can therefore be summarised below:

(23) Telicity parameter (first version)

Does the lexicon contain a functional item bearing an Asp quantity feature?
a. Yes: direct telicity (DT) language
b. No: indirect telicity (IT) language


In the above division, I use the term direct telicity language (DT language for short) to refer to languages that contain a functional item to directly value the feature on the inner aspectual head, while indirect telicity language (IT language for short) refers to those that have to adopt an indirect mechanism such as spechead agreement to value this feature. The above single parametric factor results in the cluster of differences in Table 16.1.



Table 16.1: DT and IT languages

*a* DT languages (Chinese, Slavic languages)

*b* IT languages (English and other Germanic languages)

Table 16.1 shows that a single telicity parameter, based on the existence of a functional item bearing the quantity feature, is the underlying reason for a range of cross-linguistic variation. I have already shown at different points in this

<sup>13</sup>I am assuming the proposal initiated in Distributed Morphology (cf. Halle & Marantz 1993; Marantz 2007) that the lexicon of a language has both functional items and lexical items, with the former specifying features to be engaged in syntactic computation, and the latter mainly carrying conceptual meaning.


paper that for Chinese and Slavic languages, the telic reading does not rely on a quantity object. What is crucial is the presence of the inner aspectual functional item. On the other hand, languages like English, which lack such a functional item, have to resort to the indirect valuation of the quantity feature on the inner aspectual head, which relies on copying the quantity feature of the DP in the [Spec AspQP] position. This explains why languages like English have to rely on a quantity nominal object to derive a telic reading. The same parametric difference also directly explains why quantification over bare nominals only occurs in DT languages: the functional item bearing a quantity feature can scope over the bare nominal in the [Spec AspQP] position, while in IT languages like English, without such a functional item, this type of quantification is naturally impossible.

So far, the parametric account of telicity is completely based on Borer's (2005b) XS model, and this paper provides data from Yixing to further support this account: Borer's account predicts that a quantity functional item might exist in languages other than Slavic, and the Yixing data confirm this prediction.

### **5.2 Telicity parameter: A further specification**

Huang (2015) points out that Chinese verbs seem to be inherently atelic, an observation also made in Tai (1984) and Smith (1997). This can be illustrated by the following example:<sup>14</sup>

(24) Yixing Chinese

ʣaŋsa sasə fəŋʣoŋ lidou ʃε lə sa foŋ ʃiŋ, dazi jɨ foŋ a mə wəʣəŋ.
Zhangsan thirty minute in write lə three clf letter but one clf even not complete

'Zhangsan wrote three letters in thirty minutes, but none of the letters was completed.'

Even with the marker *lə* in Yixing, the action still does not seem to have an endpoint, because the three letters are not finished. In order to guarantee the information of completeness, a "completeness particle" has to be attached to the verb. Note that there are different completeness particles in Chinese, which match different verbs. This fact carries over to Yixing. Below I use a Yixing example for the sake of consistency, wherein *wə* (the counterpart of *wan* in Mandarin) is a completeness particle.

<sup>14</sup>Since the verbal *le* in Mandarin can be either an outer or inner (telic) marker, to avoid confusion, we use Yixing in this paper to illustrate relevant points in Chinese.


(25) Yixing Chinese

ʣaŋsa sasə fəŋʣoŋ lidou ʃε-*wə* lə sa foŋ ʃiŋ, \*dazi jɨ foŋ a mə wəʣəŋ.
Zhangsan thirty minute in write-finish lə three clf letter but one clf even not complete

'Zhangsan finished writing three letters in thirty minutes, but none of the letters was completed.'<sup>15</sup>

This appears to contradict the assumption that *lə* is a telic marker that imposes a telic interpretation. Huang's (2015) explanation is that Chinese verbs are inherently atelic and thus a completeness particle is required to denote telic events. However, this account is problematic considering that, on the one hand, the completeness particle cannot guarantee the derivation of telicity and, on the other hand, telicity can arise even without such particles. As I have already shown in this paper, without *lə*, a telic sentence will not be acceptable, and the data also show that many *lə*-marked telic sentences do not need completeness particles. So here is a puzzle: on the one hand, *lə* does seem to take on the responsibility of marking telicity, but on the other hand, without a completeness particle, the event, at least on the traditional assumption, does not always express an endpoint.

To address the above puzzle, clarification of "endpoint" is crucial. Concerning the data in (25), "endpoint" is often understood as the point when the whole event is measured out by the theme argument: if the matrix verb is a consumption verb, the endpoint is understood as the point when the final bit of the food is consumed. If we think further, we will realise that this type of endpoint is not equivalent to the endpoint relevant to defining telicity. The interpretation of a telic event comes from the linguistic expression of an endpoint of an event. What is at stake here is that the linguistic derivation explicitly provides the information that the event ends at a point, regardless of whether this is the point at which the event is measured up by the object. With this in mind, I posit the following hypothesis:

(26) *lə* in Yixing (and the inner aspectual *le* in Mandarin), as a pure telic functional item, provides the abstract meaning (semantic feature) that the event ended at a certain point.

Note that the above hypothesis about the semantic contribution of *lə* is a natural consequence of the assumption that *lə* is a pure telic marker. Since, as I

<sup>15</sup>Note that the symbol "\*" in (25) does not mean the clause itself is ungrammatical, but only shows that the information expressed by the clause contradicts that of the preceding clause. This also applies in the following examples of this section.


have pointed out above, telicity is characterised by the endpoint of an event, a telic marker should be responsible for introducing this semantic characteristic. This means that as long as *lə* is involved in the sentence, an endpoint is expressed linguistically, and this endpoint does not have to be the point when the event is measured up by the object. For convenience of exposition, I term the latter type of point the measuring up point, to be distinguished from the endpoint.

For *lə*-marked sentences with canonical accomplishment verbs, the default interpretation is that the event reaches an endpoint when the event is measured up by the object. This is so because, to the language user, the measuring up point is the most accessible endpoint of such events. However, as long as there is supporting contextual information, such a default interpretation can be cancelled. This is exactly the situation of the example in (24). The second clause indicates that the event was not measured up by "the three letters". Since the telic marker *lə* provides the explicit information that the event of writing three letters ended at a certain point, and since the second clause cancels the identification of the endpoint with the measuring up point, we are forced to adopt another interpretation: in this event, the agent had the plan to write three letters, but he ended the writing event without completing any of them. This event is still telic, because it is explicitly expressed that the event arrived at an endpoint, while how much the agent had written of each letter is not specified. The *in x time* diagnostic in (24) also shows that this sentence, although lacking a measuring up point, is telic.

Now we have to address the following question: why is it that for telic accomplishment events in English, the endpoint is always identified with the measuring up point?


This question can be addressed by the mechanism of feature valuation for the derivation of telicity. Note that in English, no functional telic item is available in the lexicon, and telicity arises when the quantity feature of the object DP is copied onto the Asp<sup>Q</sup> head. A semantic consequence of this syntactic operation is that the endpoint expressed by the sentence must be identified with the measuring up point: after all, the interpretable quantity feature assigned to the inner Asp<sup>Q</sup> head is exactly the quantity feature of the object DP, and the identification of the endpoint and the measuring up point is the reflection of this syntactic operation. This identification is imposed by syntactic operation, and therefore is


not cancellable, but part of the semantic meaning that contributes to the truth value of the proposition.

Following the analysis developed so far, it can be predicted that if a language has a functional item that can directly value the feature on the inner Asp<sup>Q</sup> head, the situation of the Chinese example in (25) should also occur in this language. But the Slavic data seem to invalidate this prediction: I have argued, following Borer (2005b), that the perfective prefixes in Slavic languages are functional items that provide the interpretable quantity feature to the inner Asp<sup>Q</sup> head. This implies that such prefixes are by nature equivalent to *lə* in Yixing. Considering the hypothesis in (26), the identification of the event endpoint with the measuring up point in Slavic languages should then also be a piece of cancellable default information. This prediction does not hold, though, as shown by the following Czech examples:<sup>16</sup>

### (29) Czech

a. Dneska jsem *na*psal pět dopisů, \*ale ani jeden z nich jsem ne-*do*psal.
   today I wrote-pfv-sg five-acc letter-pl-poss but even one of them I not-finished.writing-pfv-sg
   'Today I wrote five letters, but I finished none of them.'

b. Během pět měsíců jsem *pře*četl dvě knihy, \*ani jednu z nich jsem ne-*do*četl.
   within five-poss month-pl-poss I read-pfv-sg two-acc book-pl-acc even one of them I not-finished.reading-pfv-sg
   'In five months I have read two books, but I did not finish reading either of them.'

I postulate the following hypothesis on the nature of perfective prefixes in Slavic languages:

(30) *Function A (Semantic function):* A Slavic perfective prefix functions as a lexical particle that enriches the lexical information, i.e. the conceptual meaning, of the verb; more specifically, it provides the information entailing the identification of the endpoint of an event with the measuring up point.

<sup>16</sup>I thank Eva Roubalová for providing these two examples and Nong Xi for providing further clarification with the data.


*Function B (Syntactic function):* A Slavic perfective prefix can work as a functional item that values the quantity feature on the inner Asp<sup>Q</sup> head.

Function A is invariable, while function B is optional.

An explanation of the above hypothesis is in order. Function A is largely in line with the proposal made in Filip (2005) and Filip & Rothstein (2000) that Slavic perfective prefixes are lexical operators applied to verbal predicates, an assumption also in line with Partee (1995). This assumption equates prefixes in Slavic with Chinese completeness particles, both contributing concrete lexical information to the predicate. In fact, when Slavic prefixes and Chinese completeness particles are viewed together, we can see their surface similarities: both have different forms corresponding to different predicates, and both contribute the lexical entailment that the endpoint of the event is the measuring up point. Note that if we adopt the assumption that telicity is the result of feature valuation, then such lexical items, whether Slavic prefixes or Chinese completeness particles, cannot by themselves give rise to telicity, because they are not functional (inflectional) items. This is the case in Chinese: I have shown that Chinese completeness particles alone cannot yield telicity. But Slavic prefixes do have the function of yielding telicity. This is due to function B: in addition to the lexical information, Slavic prefixes also bear an interpretable feature, i.e. the quantity feature. This explains why, on the one hand, telicity is yielded by the perfective prefix in Slavic, and on the other hand, the entailment of the identification of the endpoint with the measuring up point is also attested.

It then follows that a Slavic perfective prefix combines the functions undertaken by completeness particles and *lə* respectively in Chinese. Therefore, a prediction we can make is that to denote the semantic information yielded by a Slavic perfective prefix, i.e. both telicity and the identification of the endpoint with the measuring up point, both a completeness particle and *lə* are required in Yixing. This is exactly the case of the example in (25).

The hypothesis in (30) also claims that function B, i.e. the function of serving as a functional item, is optional. The consequence is that we can expect that more than one prefix can be stacked on a single verb in Slavic, with only one prefix undertaking function B. This is a fact pointed out by Filip & Rothstein (2000), who take it to argue against Borer's (2005b) hypothesis of taking perfective prefixes as telic functional items: if a perfective prefix is a functional item, it should not be expected to co-occur with another perfective prefix attached to the same verb. Now, with the hypothesis in (30), it is clear that when a stack of

### Xuhui Hu

two perfective prefixes occurs, one of them is only a lexical particle,<sup>17</sup> while the other serves as a telic functional item.

The above analysis shows that the DT languages in Table 16.1 do not constitute a homogeneous type, but can be further divided depending on the properties of the telic functional item. The parameter hierarchy of telicity can thus be enriched, yielding the more fine-grained picture below:

(31) Telicity parameter (revised version)

It should be noted that only a small number of languages are mentioned in the above hierarchy. I assume that most, if not all, languages can be fitted into this hierarchy depending on the properties of their lexicons with respect to telic functional items. The above hierarchy of parameters follows the spirit of parametric variation articulated in Roberts & Holmberg (2010) as well as in the ReCoS project (Roberts 2010), which resolves the tension between micro- and macro-parameters. In this sense, this research contributes to ReCoS by adding a new hierarchy of parameters to the broad picture of comparative syntax detailed by the various studies conducted in that project. Another potential contribution of this analysis is that BCC-style parametric variation can be reduced not only to the formal features in the lexicon, but also to whether a functional item additionally bears some lexical information. This assumption is not confined to the analysis of telicity parameters, but carries over to a wide range of syntactic issues in Chinese (cf. Huang 2015; Hu 2018: Ch. 4).

<sup>17</sup>This assumption implies that if a perfective prefix only takes function A, it is not a functional item but a lexical one, and hence should be merged in a position different from that of a prefix with both functions A and B. In this paper, we do not go further into the syntactic position of this lexical item. We concur with Basilico (2008) that it is a Root merged to the Root of the verb.

### 16 Functional items, lexical information, and telicity

Accordingly, the classification of the linguistic properties listed in Table 16.1 needs to be expanded with the further classification in Table 16.2.


Table 16.2: Further classification of DT languages

## **6 Conclusion**

This paper takes up a basic theme of Borer (2005b): that telicity is not part of lexical information, but the result of syntactic derivation via the standard feature-valuation mechanism of Minimalism (Chomsky 2000; 2001). Relying on the description and analysis of Yixing Chinese in Hu (2016), I argue that, like the Slavic languages, Chinese also has a telic functional item that values the quantity feature on the inner aspectual head. Further, it is proposed that Chinese differs from the Slavic languages in that the telic functional item in the latter also contributes lexical information entailing the identification of the event endpoint with the measuring-up point. A small difference in these properties leads to a cluster of variation among languages, in line with the broad implications of the hierarchy of parameters articulated in Roberts & Holmberg (2010) and in the ReCoS project summarised in Roberts (2010). Through close scrutiny of the completeness particles and verbal *lə* in Yixing Chinese and the perfective prefix in the Slavic languages, this paper also explicates how the lexical information and syntactic features specified on these items (particles or prefixes) interact in deriving the surface semantic interpretation. This line of explanation thus raises further issues for, and perhaps offers new perspectives on, recent assumptions about semi-functional items in Huang (2015) and the multifunctionality hypothesis of particles in Biberauer (2017a,b).


## **Abbreviations**


# **Acknowledgements**

This research has benefited from discussions with Theresa Biberauer, Alison Biggs, C.-T. James Huang, Joe Perry, and Ian Roberts. Parts of this paper were presented at the 24th annual conference of the International Association of Chinese Linguistics (IACL 24) at Beijing Language and Culture University (BLCU), July 2016, and at the Australian Linguistic Society conference, Monash University, Melbourne, December 2016. Yingyi Li, Yuchen Liu and Zhouyi Sun provided important help with typesetting and proofreading the manuscript. The author is responsible for any remaining errors. This research is funded by the National Social Science Fund of China, No. 18BYY044, "Neo-constructional approach to syntax: With special reference to Yixing dialect".

## **References**




# **Chapter 17**

# **Categorizing verb-internal modifiers**

# Chenchen Song

University of Cambridge

In this chapter, I propose a novel theory to explain the syntactic and semantic characteristics of a class of previously less-studied verb modifiers, namely the non-heads of compound verbs like *double-check* and *hand-wash*. Contrary to common impression, such verb-internal modifiers are widely used in English, and they also constitute a very productive compounding strategy in East Asian languages like Chinese and Japanese. In the familiar European languages, however, they are either systematically missing (e.g. in the Romance languages) or subject to odd movement constraints (e.g. in German), even though these languages do not equally lack compound nouns. The theory I propose makes use of a defective categorizer that bears a lexically unvalued categorial feature. It agrees with the categorial feature on the base verb, resulting in a word-internal adjunction structure. The model is based solely on Simplest Merge without resorting to Pair Merge or Root incorporation, can readily be extended to the nominal domain, and neatly relates the typology of verb-internal modifiers to the parametrization of verb movement.

# **1 Introduction**

There is a long line of syntactic research on verbal modifiers (VMs, É. Kiss's 2002 term), most fruitfully on verbal particles, as represented by those in Germanic languages (e.g. German *ein-kaufen*<sup>1</sup> 'in-buy; to shop', cf. Dehé et al. 2002, Haiden 2006, and references therein) and Hungarian (e.g. *ki-mos* 'out-wash; to wash out', cf. É. Kiss 1987, 2002, 2008, Hegedűs 2013). A standard view on the particle-like VMs is that they are base-generated as V-complements, e.g. in small clauses (Taraldsen 1983, Kayne 1985 et seq.). They are modifiers in the broad sense that non-heads in a phrase enrich the head's meaning.

<sup>1</sup>The hyphen is used for expository convenience and does not indicate orthography.

Chenchen Song. 2020. Categorizing verb-internal modifiers. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 357–384. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972866


There is still another type of VM which has received comparatively less attention in traditional generative studies. Observe the examples in (1).

(1) *double*-check, *second*-guess, *proof*-read, *dry*-clean, *hand*-wash, *stir*-fry, *sleep*-walk, *window*-shop, *baby*-sit, *breast*-feed, *hitch*-hike …

While the italicized components in (1) are intuitively also modifiers, these complex verbs are traditionally treated as compounds, i.e. lexical items, whose internal structures are a matter of morphology rather than syntax. In other words, the VMs in (1) are word-internal; call them verb-internal modifiers (VIMs). Unlike verbal particles, VIMs are neither V-complements nor secondary predicates. Rather, they modify the base verbs in the same way adverbs modify VPs.

Contrary to the common impression that compound verbs are unproductive in English, English speakers are evidently no less capable of creating items like (1)<sup>2</sup> than, e.g., speakers of Chinese, a language considered very productive in compound verbs.<sup>3</sup> If compounding is part of our language competence, it should be subject to general linguistic principles and, crucially, rely only on computational mechanisms made available by Universal Grammar (UG); hence there is no compounding-specific rule. Distributed Morphology (DM, Halle & Marantz 1993 et seq.) treats syntax (essentially Merge, Hauser et al. 2002) as the only generative engine in the human language faculty (the single engine hypothesis, Marantz 2001). I take this as my point of departure.

With these theoretical advances, many issues about compounding need to be carefully rethought, as witnessed by the numerous works within DM (i.a. Zhang 2007; Harley 2009; Hu 2013; Nishiyama & Ogawa 2014; Bauke 2016; de Belder & van Koppen 2016; Song 2017b). This chapter furthers this exploration by putting forward a new perspective on the structure of VIMs. To be specific, I categorize VIMs via a lexically unvalued "defective categorizer" and assign them the categorial value of the base verbs via Agree. This new model has three major advantages. First, it is solely based on Simplest Merge and labeling (Chomsky 2013), making no use of Pair Merge or Root incorporation. Second, it can be extended to the nominal domain, unifying verbal and nominal compounding. Third, it relates the typological availability of VIMs to the parametrization of verb movement.
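The valuation step at the heart of this proposal can be sketched procedurally. The following toy Python sketch is my own illustration, not the chapter's formalism: the class and function names (`Categorizer`, `agree`) are invented for exposition, and the only claim it encodes is the one stated above, namely that a defective categorizer H starts with an unvalued categorial feature and acquires the base verb's category via Agree.

```python
# Toy sketch (illustrative only, not the chapter's formal system).
# A defective categorizer H carries a lexically unvalued categorial
# feature; Agree with the base verb's categorizer v values it as V,
# so the modifier surfaces with the category of its host (hand-wash).

class Categorizer:
    def __init__(self, name, cat=None):
        self.name = name
        self.cat = cat  # None models an unvalued categorial feature

def agree(probe, goal):
    """Copy the goal's categorial value onto an unvalued probe."""
    if probe.cat is None and goal.cat is not None:
        probe.cat = goal.cat

# [H root_HAND]-[v root_WASH]  ->  hand-wash
H = Categorizer("H")           # defective categorizer, category unvalued
v = Categorizer("v", cat="V")  # verbal categorizer on the base verb
agree(H, v)
assert H.cat == "V"            # the VIM ends up verbal, like its host
```

The point of the sketch is only the direction of dependency: H contributes no category of its own and is parasitic on whatever category the host categorizer supplies, which is why the same mechanism extends to the nominal domain.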

This chapter is organized as follows. In §2, I illustrate the categorial properties of VIMs with cross-linguistic data, concluding that they are simultaneously lexical and functional and qualify as word-internal adjuncts. In §3, I review two minimalist approaches to adjunction, arguing that the labeling-based model is more favorable. In §4, I propose and motivate such a model, featuring a defective categorizer and a Root-joining schema. In §5, I further discuss the theoretical and typological predictions of the model. §6 concludes.

<sup>2</sup>Syntacticians are contributing quite a bit to this list. A quick Google search finds the following examples in the published literature: *set/pair/self-merge*, *head/phrasal/A/Ā/wh-move*, *left/right-adjoin*, etc. All are attested in the prs.3sg form, so they are unequivocally used as verbs.

<sup>3</sup>The productivity of compound verbs is influenced by multiple factors, e.g. (1)-type compounds in Chinese are extremely productive because they form standard prosodic words (Feng 1997).

## **2 The categorial status of verb-internal modifiers**

As a general observation, VIMs can be of any lexical category, as in (2).<sup>4</sup>

	- b. Chinese *shǒu*N-xǐe 'hand-write; to handwrite', *zǒu*V-dú 'walk-read; attend a day school', *dà*A-xiào 'big-laugh; to laugh loudly'
	- c. Japanese

*se*N-ou 'back-carry; to carry on back', *oshi*V-taosu 'push-topple; to push down (topple by pushing)', *chika*A-zuku 'close-attach; to get near'

One may be tempted to conclude that VIMs simply belong to their separate lexical categories. This conclusion is problematic in several ways. First, it misses the generalization that VIMs, whatever their lexical source, all perform the same function (i.e. modification). This issue does not arise in traditional studies where VIMs have no syntactic relevance whatsoever, but in the single-engine approach, we need to syntactically formalize this "beyond-lexical" equivalence class.

Second, even the lexical labels themselves may not be tenable, for VIMs and the respective lexical categories do not have much in common beyond the superficial resemblance. Consider the "N" modifiers in (2). They repel typical nominal distributions such as pluralization and quantification in English (3a), classification in Chinese (3b), and adjective modification in all the three languages (3c).

	- b. \* yì zhī shǒu-xiě 'one clf hand-write'
	- c. \* pretty hand-wash, \* qiǎo shǒu-xiě 'skillful hand-write', \* aoi se-ou 'blue back-carry'

<sup>4</sup>Chinese and Japanese have no P-origin VIMs because they lack the English-type P items (cf. Huang et al. 2009; Tsujimura 2013; Song 2017a). I leave P-related issues aside due to space limit.


Since no distributional criterion can tell us that *hand*, *shǒu*, and *se* are nouns, the label N can only come from the impression that they are usually used as nouns elsewhere (the same is true for the other VIM labels). However, such impression-based categorization is unreliable, because the same form may be reused in different categories, e.g. *a hand*<sup>N</sup> vs. *to hand*<sup>V</sup> *in the essay*. The invariant part here is the Root √hand rather than its categorized products.

Third, some VIMs do not fall into any existing lexical category, such as the prefixes in *re-build*, *un-fold*, *dis-close*, *mis-understand*, etc. They perform the same "adverbial" function as the other VIMs we have seen but cannot be categorized by impression. Similarly, in some Japanese V-V compounds, the first component is so bleached<sup>5</sup> that its assumed category V becomes vague, as in (4).

(4) *sashi*-semaru 'put-come.close; be imminent', *tori/tott*-tsuku 'take-attach; cling to, be obsessed', *hin*-mageru 'pull-bend; bend, distort', *butt*-taosu 'hit-topple; violently topple'…

According to Kageyama (1993), these italicized forms have become intensifying prefixes. Like English *re-*, *un-*, etc., they cannot be classified into any category.

In sum, if we want to identify a unified syntactic category for VIMs, the ordinary lexical categories are not a good place to look; the more plausible place is their functionality instead. That is, albeit counterintuitive, VIMs may form a functional category. This said, however, they are not inflectional, because canonical inflectional categories are closed classes, often with dedicated exponents, e.g. *-ed* for past tense. Being an open class with no fixed exponents, VIMs are again more like lexical categories.

This categorial status is reminiscent of the functionally "recycled" lexical items in Biberauer (2016a, 2017). According to Biberauer, recycling effects such as grammaticalization and multifunctionality are a distinctive property of natural languages, reflecting the domain-general third factor *maximize minimal means*. I illustrate this point with Chinese light verbs (5) and classifiers (6) (see Biberauer's works for more cross-linguistic examples).

	- b. *bǎ*-zhù fúshǒu 'hold-still handrail; to firmly hold the handrail' vs. *bǎ*-shū dǎ-kāi 'disp-book hit-be.open; to open the book'
	- b. shuǐ-*bēi* 'water-glass' vs. yì *bēi* shuǐ 'one clf water; a glass of water'

<sup>5</sup>The bleaching is not only semantic but also phonological, e.g. *tott*<*tori*, *hin*<*hiki*, *butt*<*buchi*.


Light verbs and classifiers have lexical origins, and they still retain much idiosyncrasy as function words, as evidenced by the numerous same-function items in (7), which are nonetheless non-interchangeable.

	- b. Disposal light verbs: *bǎ* 'hold', *jiāng* 'lead, support', *guǎn* 'manage' …
	- c. Classifiers: *běn* 'for books', *bēi* 'for liquid in glass', *tóu* 'for animals' …

Similar flexibility exists in other languages, e.g. there are at least four productive light verbs in English: *do*, *take*, *make*, and *have*. The cross-linguistic prevalence of semi-functional items implies some basic generative strategy. Biberauer (2016b: 5) identifies this strategy as adjoining featurally underspecified elements (effectively Roots) to null functional heads.<sup>6</sup> Following this idea, the functional heads behind light verbs and classifiers are Larsonian VP-shells (e.g. Voice, Appl, cf. Lohndal 2014) and Cl (Borer 2005; Feng 2015; Huang 2015). By comparison, the head H behind VIMs is much less clear-cut. It cannot simply be VIM, for that would entail an ad hoc formal feature (FF) [VIM], which makes little sense in our feature system.<sup>7</sup> Nor can it be any VP-shell category: on the one hand, VIMs are inside the complex verbs rather than above VP; on the other hand, while VP-shells and Cl only recycle from V and N sources respectively (in line with Roberts & Roussou's 2003 observation that grammaticalization is always upwards in a functional hierarchy), H can recycle from any contentful morpheme without categorial restriction. This makes the process more like *lexicalization*, with H systematically converting various concepts into lexical items, just like the categorizers. This effectively bears out the DM view that the non-heads of primary compounds are bare Roots (cf. de Belder 2017), though I deviate from (almost all) previous DM approaches to compounding, from RootP incorporation (e.g. Harley 2009) to Root–Root merger (e.g. Zhang 2007; Bauke 2016), for reasons to be spelled out in §4.

In fact, since the VIM is merged as a non-complement non-projecting sister of V, it is essentially a V-adjunct, which means H, if existent, systematically creates head adjunction. As such, a proper syntactic model of VIMs relies on an adequate theory of adjunction. I briefly review theories of adjunction in the next section.

<sup>6</sup>This idea deviates from DM. First, it relies on a conception of Root broader than that in DM (but closer to that in Borer 2013), for not only lexical but also functional forms can be recycled (e.g. *that*D/C). Second, it violates the DM assumption that Roots cannot appear without being categorized by one of the category defining heads (the categorization assumption, Embick & Marantz 2008; see Song 2017c for a less restrictive version compatible with Biberauer's idea).

<sup>7</sup>According to Zeijlstra (2008) and Biberauer (2016b, 2017), FFs piggyback on substantive features, so [Person] and [Gender] are legitimate FFs while [Affix] and [Complement] are not.


## **3 Minimalist approaches to adjunction**

### **3.1 Pair Merge**

One may wonder: if VIMs are adjuncts, why do we need to give them any functional head at all? Shouldn't their modifier role be self-evident? These questions implicitly take adjunction and its asymmetric effect for granted, which is undesirable given the (beyond-)explanatory goal of the minimalist program.

The standard minimalist approach to adjunction is Pair Merge (Chomsky 2000, 2004), which takes two syntactic objects α, β and yields an ordered pair 〈α, β〉. α (the adjunct) is attached to β from a separate plane, which is invisible to and thus cannot interfere with the primary-plane derivation. Following this idea, adjunction does not need any functional head but is a special operation. However, Pair Merge sacrifices the minimalist and evolutionary advantages of the theory, because, as Collins (2017: 52) points out, it has to be stipulated as an independent UG operation, which goes against the strong Minimalist thesis (SMT, "language keeps to the simplest recursive operation", Berwick & Chomsky 2016: 71). Chomsky (2013: 40) also criticizes the "extension of Merge", arguing that there is no remerge, multidominance or late Merge (among others), but only simple Merge.

Also note that the motivation of Pair Merge is empirical ("it is an empirical fact that there is also an asymmetric operation of adjunction", Chomsky 2004: 117), but its problem is conceptual. As such, if we could give the "empirical fact" an alternative explanation, Pair Merge would no longer be needed. I will discuss such alternatives in §3.2. For now, let's turn to another problem of Pair Merge, raised in Rubin (2003):

We need to avoid circularity here, so we cannot simply say that we want adjuncts to be adjuncts, so we invoke pair-Merge, which creates adjuncts. Before any two expressions are merged, relational terms such as *adjunct*, *complement*, and *specifier* are premature. (Rubin 2003: 663)

The problem is essentially how syntax can determine that Pair Merge is appropriate for adjuncts. Rubin's solution is a dedicated functional head Mod, which "forms an extended projection around all base adjuncts" such that "[a]ny phrase headed by Mod is subject to pair-Merge" (p. 664). This idea is not so different from our functional head H in §2, and it is also compatible with the Borer–Chomsky conjecture (BCC, Baker 2008), which highlights the fundamental role of features. However, the solution is not optimal. First, as Arsenijević & Sio (2009: 2) notice, when Mod connects a modifier to a noun (both are phrases in Rubin 2003), it selects twice – first the modifier and then the noun, as in (8) – but Pair Merge only happens in the second selection, which makes the triggering effect of Mod inconsistent.


Second, and more relevant to us, Mod has no substantive featural basis. Though Rubin (2003: 666) specifies its semantic type as 〈〈e,t〉,〈〈e,t〉,〈e,t〉〉〉 ("a function from predicates to properties of predicates"), this only describes the function we want Mod to perform but does not relate it to any conceptual interpretation. So Chomsky's (1995) criticism of Agr (that it is present only for theory-internal reasons) also applies to Mod. These two problems may not be insurmountable, but they do show that Rubin's intuition can be further developed.

### **3.2 Labeling**

While Rubin (2003) "determines" Pair Merge and justifies its role in adjunction, Hornstein (2009) and Oseki (2015) dispense with it and derive adjunction via Simplest Merge (Hornstein's "concatenate") plus labeling. Following Epstein et al. (2012), Oseki (2015) assumes that when two phrases XP and YP merge but share no feature, the merger cannot be labeled. Adopting the label accessibility condition (LAC, Hornstein 2009: 90, Epstein et al. 2012: 254),<sup>8</sup> which states that unlabeled syntactic objects cannot be accessed by Merge,<sup>9</sup> Oseki further claims that at this stage the derivation can only proceed by letting one of XP and YP participate in further Merge, yielding the "two-peaked" structure in (9). In Hornstein's terms, YP "dangles off" the [ZP Z XP] complex.

Epstein et al. (2012: 261) conceive this structure as "two intersecting set-theoretic SOs". Crucially, one peak must be removed (via Transfer) from the narrow syntax, which then becomes inaccessible to later derivation, rendering the island effect.<sup>10</sup>

<sup>8</sup>Epstein et al. (2012: 262) deduce LAC from minimal search and conceive it as a third factor consequence in the sense of Chomsky (2005).

<sup>9</sup>This view is not unanimous, e.g. for Chomsky (2013) labels are only needed by the interfaces. As such, the indispensability of LAC in Epstein et al.'s model may turn out to be a disadvantage.

<sup>10</sup>Epstein et al.'s main focus is the Spec-TP subject. Oseki extends their model to adjuncts.
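The labeling logic just reviewed is essentially algorithmic, so it can be sketched in a few lines of code. The toy Python sketch below is my own construal of the Epstein et al. (2012)/Oseki (2015) idea, not their formalism: the class and function names (`SO`, `merge`) and the feature inventory are invented for exposition. It encodes only the two cases discussed in the text: a head-phrase merger is labeled by the head, while a phrase-phrase merger is labeled only by a shared feature, and otherwise stays unlabeled (and hence, by the LAC, frozen for further Merge).

```python
# Toy sketch (my construal, illustrative only): labeling by shared
# features. A phrase-phrase merger with no shared formal feature is
# unlabelable and, by the label accessibility condition (LAC), cannot
# be accessed by further Merge.

from typing import Optional

class SO:
    """A syntactic object: a name, a set of formal features, head flag."""
    def __init__(self, name: str, features: set, head: bool = False):
        self.name = name
        self.features = features
        self.head = head  # lexical heads label their projections trivially

def merge(a: SO, b: SO) -> Optional[SO]:
    if a.head != b.head:                 # head + phrase: the head projects
        h = a if a.head else b
        return SO(h.name + "P", h.features)
    shared = a.features & b.features     # phrase + phrase: minimal search
    if shared:                           # shared feature(s) supply a label
        return SO("<" + ",".join(sorted(shared)) + ">", shared)
    return None                          # unlabelable: frozen by the LAC

# Spec-TP: the subject DP and T' share phi-features, so labeling succeeds.
assert merge(SO("DP", {"phi"}), SO("T'", {"phi", "T"})).name == "<phi>"
# An adjunct PP sharing nothing with VP cannot be labeled in situ,
# which is what forces the "two-peaked" continuation in (9).
assert merge(SO("PP", {"P"}), SO("VP", {"V"})) is None
```

The sketch makes the contrast concrete: the Spec-TP case succeeds because of feature sharing, while the adjunct case returns no label at all, which is precisely the scenario the two-peaked model and the present chapter's alternative both try to resolve.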


Several issues remain unclear. First, the definition of "peak" is vague. Geometrically, a peak consists of two branches, but then removing a peak amounts to removing an entire {XP, YP}, which means XP cannot stay in syntax to merge again. Second, even if XP could stay, since Transfer cannot undo Merge, the removal of YP cannot save the second merger of XP from violating *no tampering*; and since the intersected element is contained in two sets, set intersection inevitably leads to multidominance. Third, for Epstein et al. the removed peak is consistently the complement of a phase head, but this causes trouble for Oseki, as it wrongly predicts that adjuncts only ever adjoin to phase heads.

While the two-peaked model is far from ideal, the labeling idea behind it is indeed more advantageous than Pair Merge: (i) it obeys the SMT and is evolution-friendly; (ii) it reduces the specialness of adjunction to specific features, in line with the BCC. Remember that Rubin's (2003) idea was also to reduce adjunction to a specific category, which makes it potentially compatible with a labeling-based approach. Thus, instead of resorting to "unlabelable" scenarios (e.g. the two-peaked model), we could also seek a solution in scenarios where labeling proceeds normally (as in Rubin's model). I propose a new model along these lines in §4.

### **3.3 Interim summary**

To recapitulate §§1–3, the structure of verb-internal modifiers (V-level adjuncts) is a tricky issue for syntactic approaches to word-formation, partly due to the elusive categorial status of VIMs and partly due to the unavailability of a satisfactory theory of adjunction. The two problems point to the need for a categorial account of adjunction, e.g. via a mediating functional head H. As such, among previous approaches to adjunction, those based on labeling (manipulating categories) are more advantageous than those based on Pair Merge (a specialized UG operation). In addition, among potential labeling-based theories, those featuring "labelable" scenarios are more coherent than those featuring "unlabelable" ones.

## **4 Deriving verb-internal modifiers**

### **4.1 How** *not* **to merge a Root**

As mentioned in §2, the relation between H and VIMs is similar to that between categorizers and Roots. Note that I did not prove the necessity of H, but only speculated that it could potentially replace Rubin's (2003) Mod. Two points might cast doubt on this speculation. First, labeling (essentially minimal search) does not need any special head to proceed: FFs on the Merge input alone are enough. Second, if VIMs are Roots, then H can be nothing but a categorizer (à la the categorization assumption, cf. footnote 6), which leads to a dilemma, for no existing lexical category is adequate for VIMs.<sup>11</sup>

This dilemma is faced by all models applying ordinary categorizers to compound non-heads (e.g. Harley 2009 for compound nouns), but it does not force us to resort to uncategorized "floating Roots" (e.g. de Belder & van Koppen 2016) or postsyntactic Root operations (e.g. fission, de Belder 2017), especially if those solutions rely on unwarranted definitional extension of Root, which is no more desirable than extension of Merge. Below I will defend the conservative position that Roots are bare (FF-less), syntactically inert (no √P), and must be categorized.

To begin with, the bare Root view is faithful to the original purpose of Root theory, i.e. lexical decomposition.<sup>12</sup> Lexical decomposition targets non-primitive lexical items (LIs) and submits that any composite LI, be it a pure FF bundle or an FF-equipped Root, has to be assembled from smaller atoms rather than appearing as such all of a sudden. This is evidenced in language acquisition/change, where feature bundles are gradually formed and remain alterable.<sup>13</sup> To wit, any theory working with bundled features has to assume some LI-forming mechanism, including DM.<sup>14</sup> However, as Collins (2017) remarks, this poses a conceptual problem, because "that mechanism is not Merge":

This state of affairs seems undesirable for two reasons. First, humans have an unlimited capacity to learn and to coin new lexical items, just like they have an unlimited capacity to form new phrases […] Second, adding a new mechanism (to form lexical items) would increase the complexity of UG, going against the SMT. (Collins 2017: 61)

Collins concludes that LIs are formed by Merge. So, FF-equipped Roots, if any, must also be products of Merge, which takes bare Roots and FFs as input. In short, the single engine hypothesis and SMT together force a bare Root view.

<sup>11</sup>Similar considerations led de Belder & van Koppen (2016) to conclude that the non-heads of some Dutch compound nouns are bare Roots without any functional category, not even categorizers.

<sup>12</sup>See Ramchand (2008: 11) and Gallego (2014: 192) for summaries of various Root views.

<sup>13</sup>Despite the intuition that we use LIs as whole units, the existence of sub-LI knowledge has never been denied (hence the branch "morphology") – it has simply been handled by a separate generative engine (the lexicon). In this sense, lexical decomposition is not introducing anything new but merely aims to capture the sub-LI knowledge in the single-engine framework.

<sup>14</sup>Marantz (1997: 203) conceives the DM narrow lexicon as "generative", as it contains "atomic bundles of grammatical features [that are] freely formed, subject to principles of formation".


Following this line of thought, if Roots are stored bare presyntactically,<sup>15</sup> they must be inert in narrow syntax, which only manipulates FFs. Among other things, this means Roots cannot head or project/label, hence there is no √P (in line with i.a. Acquaviva 2009, Borer 2009, 2014, Chomsky 2013, de Belder 2011 et seq., Alexiadou 2014; contra Cuervo 2014, Harley 2014). Moreover, since no featural dependency can ever be established on Root nodes, they can neither be moved nor host movement,<sup>16</sup> hence there can be no Root incorporation (contra Harley 2009). The only way a Root may participate in syntactic derivation is via categorization, either exclusively by the lexical categorizers (as in standard DM) or by any functional category (as in Borer 2005; 2013; Biberauer 2016b; Song 2017c). What matters here is that there can be no floating Root, i.e. every Root must be the most deeply embedded element in its workspace (a conclusion compatible with Marantz 2001 and Boeckx 2014). As such, the model in (10a) is infelicitous, for it is impossible to categorize the VIM Root without letting it project (10b) or remerge (10c).<sup>17</sup>

Note that (10c) is the two-peaked structure in §3.2. Despite its infelicity, the idea that √vim may be categorized in adjunction is insightful. Therefore, if we could overcome the multidominance problem, (10c) may well become a felicitous model.

<sup>15</sup>This does not rule out the possibility that non-bare Roots (just like other composite LIs and even larger phrases) could be lexicalized and stored postsyntactically (in DM lists 2 and 3) or extra-syntactically (as general experience, cf. Marantz 2013).

<sup>16</sup>Thus, Roots may be conceived as adjuncts (à la Marantz 2013).

<sup>17</sup>Strictly speaking, √vim can only be categorized via the multidominance structure in (10c), because in (10b) what the upper *v* categorizes is √P rather than √vim. Besides, (10b) wrongly predicts VIMs can only be V-origin.


I will further pursue this route in §4.2. For now, let's turn to another infelicitous structure in (11).

This is the compounding model adopted in i.a. Zhang (2007), Borer (2013), Bauke (2014; 2016) and de Belder & van Koppen (2016). A clear problem with it is the symmetric relation between the two Roots, which means there is no way to determine which Root is the modifier and which is the verb at logical form (LF), nor can they be algorithmically linearized at PF. Borer (2013) resorts to Root incorporation to yield the asymmetry, but this operation is illegitimate under the bare Root view, as FF-less objects cannot be moved.<sup>18</sup> For more thorough arguments against direct Root-Root merger see Song (2017c).

With (10–11) ruled out, we are left with only one structure for deriving VIMs, i.e. [H √vim]-[*v* √verb], where the two Roots are separately licensed before being joined together. The necessity of a functional head H is thus proved, not by the requirements of labeling but by the nature of Roots.

### **4.2 Defective categorizer**

Further examination of the structure [H √vim]-[*v* √verb] reveals that H and *v* must share some feature(s), for otherwise the structure is unlabelable.<sup>19</sup> However, H cannot simply be *v*, because that would make the structure symmetric, just like (11), and its two branches formally indistinguishable (distinctness is an important interface principle, cf. Richards 2010). Rather, H and *v* should be simultaneously homogeneous and non-identical, and ideally the distinction should not be achieved by bundling extra features into H/*v*, for that would go against the spirit of lexical decomposition. Remember that in §2 H was likened to categorizers, and that in §4.1 the ordinary DM categorizers were ruled out. As such, a simple hypothesis about H is that it is a special categorizer.

<sup>18</sup>De Belder (2017) proposes a fission-based variant of (11), where the two Root nodes are "split" postsyntactically and the asymmetry is yielded by "the order of insertion". I do not have space to evaluate this approach, but *ceteris paribus* the model I will propose later is free from postsyntactic operations and thus potentially more parsimonious.

<sup>19</sup>Here I follow Chomsky's (2013, 2015) conception that all branching nodes (i.e. all products of Merge) must be equipped with a label at the interfaces. See Bošković (2016) and Bauke & Roeper (2017) for looser positions and Collins (2002 et seq.) for a label-free system.

To identify H, therefore, we need a better understanding of categorizers and their place in the inventory of functional categories. A first point to note is that terms like "categorizer", "categorial feature", and "categoryless" are used loosely in the literature, because if items without a categorizer are categoryless, then categoryless items would include not only Roots, but also T, Asp, Num, etc. In other words, if categorial features (largely limited to [N], [V], [A]) are what define categories (as the term literally suggests), then various functional categories would end up being non-categories. Obviously, this is not a prediction DM is meant to make; what the above terms really mean is "lexical categorizer", "lexical categorial features", and "lexical-category-less". So, our mission is to identify a special *lexical* category.

Despite their intuitive straightforwardness, lexical categories are a notoriously disputed area in minimalism. As content words are decomposed into categorizers and Roots, the previously held lexical categories become functional in nature. However, "lexical", "noun", "verb", etc. do not follow the nomenclature of functional categories (FF-based, piggybacking on substantive features, cf. note 7) and need to be either renamed or redefined. Two representative approaches exist in this regard. Borer (2005) denies the existence of dedicated categorizers and treats traditional lexical categories as distributional contrasts that are only definable as "categorial complement spaces" of functional projection series, e.g. D-Num-Cl is "nominal" while C-T-Voice is "verbal" (Biberauer 2016b has a similar view). On the other hand, Panagiotidis (2015, 2017) endows the categorial features [N] and [V] with interface substantiveness, letting them represent two "fundamental interpretive perspectives" (FIPs) – "sortality" and "extending into time":

Sortality will have to be associated with *individuation*, number, quantification etc. — realised as functional categories Number, Determiner etc. "Extending into time" will be the seed of events and causation, and will require event participants, a way to encode length of event and relation between time intervals etc. – realised as an event projection / argument, Voice, Aspect, Tense. (Panagiotidis 2017: lecture 1, p. 4)

The two approaches are not necessarily incompatible. Considering that many conventional labels have turned out to be mixtures of heterogeneous concepts (e.g. IP/CP are extended domains, Merge<sub>MP</sub> = Merge<sub>PoP</sub> + labeling<sup>20</sup>), lexical categorial labels like "noun" and "verb" may also have multiple dimensions that could

<sup>20</sup>MP = *Minimalist program* (Chomsky 1995), PoP = *problems of projection* (Chomsky 2013).


(and should) be unbundled. Specifically, we can conceive "noun", "verb", etc. as distributional patterns following Borer while having an FIP-introducing functional layer in each pattern following Panagiotidis. This layer may be identified as the "categorizer" but is not really the original DM categorizer, for it does not turn a Root into a conventional noun/verb but merely turns it into an FIP-bearing item. Other nominal/verbal properties (e.g. referentiality, argument structure) are introduced by additional functional layers in later derivation. Featurally speaking, the FIP-introducer is not so different from other functional heads such as T and Gen in that they are all FF-based<sup>21</sup> and interface-motivated, as in Table 17.1.



Following Adger & Svenonius (2011), a valued feature is a pair of attribute and value 〈att, val〉 – or [att:val] in more popular notation – which may be a UG-given template (in the sense of Biberauer 2016b). The attribute is a feature class (i.e. a subset of all features) and the value a feature belonging to that class. Thus, [N] and [V] are more precisely [FIP: sortal/ext-in-time] (henceforth [FIP: N/V] for expository convenience), similar to [T: pres/past]. Adger & Svenonius argue that since the feature classes themselves can be referred to by rules or principles (e.g. agreement copies φ-features), they are grammatically active independently of concrete values. This means there can be valueless attributes – an unsurprising conclusion given the fundamental syntactic role played by unvalued features, or more exactly feature classes (the term "feature" is variably applied to features and feature classes, Adger & Svenonius 2011: 35).
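This attribute–value conception lends itself to a direct sketch. The following toy model (in Python; the class and variable names are my own shorthand, not Adger & Svenonius's notation) shows valued and unvalued features sharing one attribute:

```python
# Toy model of Adger & Svenonius-style features: a feature is an
# <attribute, value> pair, where the value may be absent (unvalued).

from dataclasses import dataclass
from typing import Optional

@dataclass
class Feature:
    att: str                   # feature class, e.g. "FIP", "T"
    val: Optional[str] = None  # e.g. "N", "V", "past"; None = unvalued

    @property
    def valued(self) -> bool:
        return self.val is not None

# Ordinary categorizers carry valued FIP features ...
n = Feature("FIP", "N")   # the nominalizer n: [FIP:N]
v = Feature("FIP", "V")   # the verbalizer v:  [FIP:V]
# ... while a defective head could carry a bare attribute.
cat = Feature("FIP")      # [FIP: _], grammatically active but unvalued

assert v.valued and not cat.valued
# Rules can refer to the attribute alone, independently of any value:
assert cat.att == v.att == "FIP"
```

The point of the sketch is only that an attribute remains referable with or without a value, which is what licenses standalone unvalued features below.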

Previous discussions of unvalued features are largely limited to "parasitic" ones, i.e. unvalued features bundled on heads defined by valued features, such as [*u*T] on V and [*u*φ] on T. But in the context of lexical decomposition, there may well be standalone unvalued features making up their own heads.<sup>22</sup> I postulate an unvalued FIP-introducer, consisting of a single [*u*FIP] feature (more vividly [FIP: ]), which declares an FIP interpretation but leaves its value underspecified.

<sup>21</sup>Strictly speaking, there can be no non-FF-based differences among functional heads.

<sup>22</sup>In fact, this is the only possibility if Collins (2017) is on the right track. Unvalued and valued features can still be bundled, but that can only be done in syntax via Merge (cf. §3.1).


Assuming the lexically valued [FIP: N] and [FIP: V] correspond to the ordinary categorizers *n* and *v*, we may call the unvalued FIP-introducer a "defective categorizer" (Cat for short).

### **4.3 Cat and verb-internal modifier**

Cat counts as a non-ordinary lexical categorizer in that it is lexically unvalued. As a result, the Root material it introduces has no concrete FIP interpretation and appears categoryless. This is precisely what we need from H in [H √vim]-[*v* √verb], so I identify H as Cat. In this section, I will show how Cat derives VIMs.

I adopt the following theoretical assumptions. First, categorizers (however defined) are phase heads (à la Marantz 2001). But unlike Chomskyan *v*\*P (though maybe like CP), the categorizer phase is spelled out as a whole, including both the Root and the categorizer. This is because the Root cannot be properly interpreted without the categorizer. Second, spelled-out constituents do not necessarily vanish from the syntax. Some (e.g. complex "satellites" like specifiers/adjuncts) leave their labels behind as "bookmarks" that behave as terminal nodes (X<sup>0</sup>s) for linearization purposes (Nunes & Uriagereka 2000; Fowlie 2013). Third, the bookmarkish "new" lexical items may be derived by spellout plus "renumeration" (Johnson 2003). That is, satellite substructures may be separately derived (perhaps via lexical subarrays in separate workspaces), labeled, and put back in the numeration, so that they can participate in the next cycle of derivation. With these technical devices, we can now derive modificational compound verbs.

To begin with, Cat and *v* separately categorize a Root. Since the Roots are not lexically marked as VIM or V, I simply write them numerically as √1 and √2.

	- a. Select Cat and √1 into a lexical subarray LA.
	- b. Merge Cat and √1. LA is exhausted. Transfer.
	- c. Since the Root is FF-less, Cat labels {Cat, √1} as Cat (featurally [*u*FIP]).
	- d. Renumerate the Root-supported Cat (notated as Cat<sup>√</sup> ).
	- e. Repeat steps a-d for *v* and √2.
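Under the chapter's assumptions, the cycle in (12) can be sketched procedurally (a toy rendering; the function, dictionary keys, and labels are my own shorthand, not part of the formalism):

```python
# Sketch of the categorizer-phase cycle in (12): each lexical subarray
# pairs one categorizer with one Root; Transfer spells the pair out and
# renumerates a terminal-like "bookmark" item bearing the categorizer's
# label (Cat^root or v^root in the text's notation).

def categorize(categorizer: str, root: str) -> dict:
    """Merge a categorizer with a Root, label the set, Transfer, and
    return the renumerated bookmark item."""
    merged = {categorizer, root}   # Simplest (set) Merge
    label = categorizer            # the FF-less Root cannot label
    return {"label": label, "content": merged, "terminal": True}

numeration = [
    categorize("Cat", "root1"),    # steps (12a-d): Cat + first Root
    categorize("v",   "root2"),    # step (12e):    v   + second Root
]

# After (12), the numeration holds two recycled, terminal-like items.
assert all(item["terminal"] for item in numeration)
assert [item["label"] for item in numeration] == ["Cat", "v"]
```

The two dictionaries in `numeration` play the role of the recycled lexical items that feed the next, Chomskyan cycle of derivation.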

After (12), the numeration contains the two "recycled" lexical items Cat<sup>√</sup> and V<sup>√</sup> . This is the end of word-internal derivation and the beginning of the Chomskyan derivation, where lexical items are equipped with categorial information.

Then, Cat<sup>√</sup> and V<sup>√</sup> are selected into another lexical subarray LA together with other *v*\*P-phase items and merged accordingly. Upon the next Transfer, the unvalued FIP feature on Cat probes for a value and finds one on V. It is thus valued


via Agree, and the Cat<sup>√</sup> -V<sup>√</sup> merger is labeled as V by feature sharing (Chomsky 2013), as in (13a).<sup>23</sup> See (13b) for a concrete example.<sup>24</sup>

Supposing the system can distinguish intrinsically valued features from features valued via Agree,<sup>25</sup> there is a derivational asymmetry between Cat<sup>√</sup> and V<sup>√</sup>, with the former's interpretation depending on the latter's. This dependency may be reflected in semantics as variable sharing, which I briefly illustrate below.
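Before turning to the semantics, the featural side of this valuation step can be made concrete in a small sketch (assuming, with note 25, that derivationally obtained values are flagged as such; all names are illustrative, not the chapter's formalism):

```python
# Sketch of the valuation step: the unvalued [FIP] on Cat probes V's
# valued [FIP:V]; Agree copies the value but records that it was
# obtained derivationally, yielding the Cat/V asymmetry in the text.

def agree(probe: dict, goal: dict) -> str:
    """Value the probe's attribute from the goal; return the shared
    label of the merger (labeling by feature sharing)."""
    assert probe["att"] == goal["att"] and goal["val"] is not None
    probe["val"] = goal["val"]
    probe["valued_by_agree"] = True   # distinguishable from lexical value
    return goal["val"]

cat = {"att": "FIP", "val": None}                            # Cat-root item
verb = {"att": "FIP", "val": "V", "valued_by_agree": False}  # v-root item

label = agree(cat, verb)
assert label == "V" and cat["val"] == "V"
# The derivational record keeps the two branches formally distinct:
assert cat["valued_by_agree"] and not verb["valued_by_agree"]
```

The final assertions show how the two [FIP:V]-bearing branches remain non-identical despite sharing one label, which is the asymmetry exploited below.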

Under the bare Root view, I assume that the denotation of a Root is radically underspecified, to the extent that it is not only grammatically void, but also does not constitute a complete function. Instead, a Root merely denotes a vague property – a "function template" whose domain (including variable type) is not yet defined, as in (14a). This information is only added when the Root is categorized, as in (14b).

(14) a. ⟦√wash⟧ = λ_.wash(_)
= 'encyclopedically related to wash and compositionally underspecified'

b. ⟦[*v* √wash]⟧ = λe.wash(e)
= 'encyclopedically related to wash and compositionally an extending-into-time FIP (i.e. an event)'

<sup>23</sup>I remain agnostic as to whether feature sharing in labeling is the same mechanism as that in agreement as proposed in i.a. Frampton & Gutmann (2000, 2006) and Haug & Nikitina (2016).

<sup>24</sup>I assume the pairing of Roots and categorizers to be a matter of pre-linguistic planning. As Chomsky (1995: 227) remarks, there is "no meaningful question as to why one numeration is formed rather than another". What matters here is merely that each LA only contain one Root.

<sup>25</sup>I leave aside the technical details, but any adequate theory would be compatible. See Rooryck & Vanden Wyngaerd (2011: 10) for a proposal based on feature sharing.


The event variable *e* in (14b), which defines eventuality, is introduced by the verbalizer (cf. Marantz 2013). Since the verbalizer is featurally [FIP:V], *e* is presumably encoded in the value [V]. More generally, I assume all variable types to be functionally introduced rather than being an inherent part of the Root. Being valueless, Cat does not introduce any variable type, though it does endow the Root with interface interpretability (as an FIP).<sup>26</sup> So, Cat<sup>√</sup> has the denotation in (15).

(15) ⟦[Cat √hand]⟧ = λ_.hand(_)
= 'encyclopedically related to hand and compositionally a FIP'

After Agree, Cat<sup>√</sup> is equipped with the event variable introduced by *v*. However, since the categorial interpretation of √1 has been fixed in the previous spell-out cycle, the newly obtained *e* can no longer turn √1 into an independent event, but only connects it to another event, i.e. that denoted by V<sup>√</sup> . As such, √1 effectively becomes a modifier of V<sup>√</sup> , as in (16).

(16) a. ⟦(13a)⟧ = λe.√1(e) ∧ λe.√2(e)
= 'encyclopedically related to √1 and compositionally connected to an event' ∧ 'encyclopedically related to √2 and compositionally an event'
= 'an event of √2, encyclopedically related to √1' (event identification)

b. ⟦(13b)⟧ = λe.hand(e) ∧ λe.wash(e)
= 'encyclopedically related to hand and compositionally connected to an event' ∧ 'encyclopedically related to wash and compositionally an event'
= 'an event of washing, encyclopedically related to hand'

Since {Cat<sup>√</sup> , V<sup>√</sup> } and V<sup>√</sup> have identical labels, Cat<sup>√</sup> is in effect an adjunct. Since Cat<sup>√</sup> is dominated by V, it is verb-internal. The modificational compound is thus derived solely by Simplest Merge and labeling, with no need for Pair Merge, Root incorporation, postsyntactic operations, or multidominance. In effect, the structure in (13) unifies two Roots under one ordinary categorizer without violating the DM tenet that one categorizer can only categorize one Root (cf. Embick 2010).

<sup>26</sup>A consequence of the single engine hypothesis is that unvalued features must not be deleted by the end of the categorizer phase (i.e. when the categorized Roots are renumerated), because they are still required in the Chomskyan numeration and the next cycle of derivation. I merely acknowledge this point but do not attempt to account for it in this study.


## **5 Some implications**

### **5.1 Noun-internal modifiers**

In §4, I illustrated how VIMs are derived by Cat, but the application of the defective categorizer hypothesis is not confined to the verbal domain. In fact, since all Cat<sup>√</sup> needs is an FIP value, it may well be merged with a noun and become a noun-internal modifier (NIM). While leaving NIMs to future research, in (17) I illustrate the flexibility of Cat by items that can be used as both VIM and NIM.

(17) a. English

hand[FIP:V]-wash[FIP:V] vs. hand[FIP:N]-gel[FIP:N], sleep[FIP:V]-walk[FIP:V] vs. sleep[FIP:N]-mode[FIP:N], breast[FIP:V]-feed[FIP:V] vs. breast[FIP:N]-bone[FIP:N]


b. Japanese

se[FIP:V]-ou[FIP:V] 'back-carry; to carry on back' vs. se[FIP:N]-bone[FIP:N] 'back-bone', oshi[FIP:V]-taosu[FIP:V] 'push-topple; to push down' vs. oshi[FIP:N]-bana[FIP:N] 'push-flower; pressed flower'

The Cat-licensed Roots √hand, √sleep, √breast, etc. have no fixed FIP interpretation – they become VIMs when merging with V<sup>√</sup> s and NIMs when merging with N<sup>√</sup> s. Admittedly, whether or not a specific Cat-item has both verbal and nominal uses is a matter of language-specific lexicalization, e.g. while all of *handwash*, *hand-gel*, and *foot-gel* are fine in English, there is no ?*foot-wash* ('wash with foot') at the time of writing (though it could easily be coined). The defective categorizer hypothesis does not aim to predict which VIMs/NIMs actually exist in a certain language, but merely captures the capacity of human beings to create such language units.

### **5.2 Universality of compounding**

The proposed theory can not only be extended to the nominal domain but also predict the pervasiveness of modificational compounds. Given the mutual dependence between feature classes (attributes) and features (values), the FIP class


(i.e. the set of FIP values) should be as widespread as its values. Moreover, if "no value" can be conceived of as the empty set, i.e. [F: ] = [F: ∅], then unvalued features are in effect free-riders of their valued counterparts, for the empty set is a subset of all sets, including all feature classes (conceived of as subsets of all features, cf. §3.2). This means any language with at least one FIP value also has a grammatically active defective categorizer. In other words, modificational compounding as a generative mechanism is as widespread in human languages as conventional lexical categories, i.e. universal (cf. Baker 2003; Panagiotidis 2015).
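The set-theoretic step in this argument rests only on the fact that the empty set is a subset of every set, which can be checked directly (a trivial demonstration; the `fip_values` toy class is my own illustration):

```python
# The empty set is a subset of every set, so an "empty-valued" feature
# class [F: ∅] is available wherever the class F itself is.

fip_values = {"N", "V"}        # toy FIP class with two values
assert set() <= fip_values     # ∅ ⊆ FIP
assert set() <= set()          # ∅ ⊆ ∅: holds even of an empty class
```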

This conclusion is supported by typological studies. According to Bauer (2009: 344), (modificational) compounding has been suggested to be a language universal (Fromkin et al. 1996: 54–55; Libben 2006: 2), as evidenced by language acquisition (Clark 1993) and contact (Plag 2006). A caveat here is that universality may be masked by varied terminology and classification in descriptive grammars. For example, descriptions of Ainu (e.g. Refsing 1986; Shibatani 1990) do not mention compounding at all, though the language does have de facto compounds, as in (18a). Similarly, Evenki has also been claimed to lack compounds (Nedjalkov 1997: 308), but a quick look into alternative sources reveals many of them, as in (18b).

	- b. Evenki (Tungusic; cf. Hu & Chao 1986) eyji shee 'brick tea', aaxin jolo 'liver stone; marble', unaaji ute 'girl son; daughter'

### **5.3 Compound verb typology**

Despite the universality of modificational compounds, compound nouns are cross-linguistically a lot more common than compound verbs. Take the familiar European languages for example: while modificational compound nouns exist across the Germanic, Romance, and Slavic languages (cf. Bauer 2009), compound verbs like *hand-wash* are only seen in English with some productivity. One might take this to be an areal phenomenon, for compound verbs are more widely used in e.g. East Asia. However, as Bauer (2009: 355) comments, the areal preferences are not clearly correlated "with anything linguistic in the appropriate languages".

The defective categorizer hypothesis provides a new perspective on modeling this unbalanced typology. Since the node dominating [Cat<sup>√</sup> V<sup>√</sup> ]<sup>27</sup> has

<sup>27</sup>Similar to Booij's (1990) V\*, which is more than V<sup>0</sup> but less than V′ (cf. Vikner 2005).


exactly the same label as the V<sup>√</sup> node, operations targeting one node also target the other. As a result, in languages requiring V-to-T/C movement, the T/C probe is unable to access the real V<sup>0</sup> (i.e. V<sup>√</sup> , which becomes a terminal lexical item after renumeration, cf. §4.3) due to the intervening dominating node, as in (19). This is presumably a minimal search effect, as formulated in the minimal link condition (20).

(20) Minimal link condition (Chomsky 1995: 311): K attracts α only if there is no β, β closer to K than α, such that K attracts β.

In addition, since the dominating node is not a minimal category (head) on the clausal spine, it cannot undergo head movement either (see 19). Therefore, in the end nothing moves to T/C, and the derivation crashes. This means Cat–V compound verbs are only well-formed in languages/contexts without a verb movement requirement. So Romance languages, where V systematically moves to T (cf. Biberauer & Roberts 2010), are incompatible with such compound verbs. For instance, the concepts in (1) are expressed periphrastically in Spanish, as in (21).<sup>28</sup>
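The intervention effect in (19–20) amounts to a minimal-search routine that halts at the first matching label. A toy rendering (the node representation and names are my own; only the search logic tracks the text):

```python
# Sketch of the minimal-search effect in (19-20): a T probe attracting V
# stops at the closest V-labelled node. In a Cat-V compound that node
# dominates [Cat V] and is not a minimal head, so verb movement fails.

def closest(probe_target: str, path: list) -> dict:
    """Return the first node matching the probe on a top-down path."""
    return next(node for node in path if node["label"] == probe_target)

# Path from T down into the verb phrase of a Cat-V compound:
path = [
    {"label": "V", "head": False},   # node dominating [Cat V]: not a head
    {"label": "V", "head": True},    # the real V0, now inaccessible
]

found = closest("V", path)
assert not found["head"]   # minimal link: the non-head node intervenes
# -> nothing can move to T, so the derivation crashes in V-movement languages
```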


However, the prediction as such is too strong, for apart from V-to-T/C, there is also V-to-*v*\* (or more generally V-to-VP-shell) movement, e.g. in English (cf.

<sup>28</sup>Retrieved from oxforddictionaries.com and wordreference.com (29 Dec 2017).


Roberts 2010, 2019). So, if Cat–V compounds and verb movement are totally complementary, then English becomes a major counterexample.

One possible solution lies in the design of Cat. Since it merely needs to merge with something that can provide it with an FIP value (and thus label the merger), which in the case of [V] is essentially an event variable, it can in theory merge with any head equipped with an event variable. In a neo-constructionist event structure (cf. Acedo-Matellán 2016), this may be any subevental head (e.g. Init/Proc/Res in Ramchand 2008) or argument-introducing head (e.g. Voice/Appl in Pylkkänen 2008). Considering that Internal Merge occurs at the phase level (cf. Citko 2014), i.e. after all steps of External Merge in a phase are done, and that the Cat<sup>√</sup> -V<sup>√</sup> merger is External Merge, here I make the conservative hypothesis that apart from the verbalizer, the next position Cat may attach to is the *v*\* phase head (whichever head that turns out to be in an elaborate verbal domain). Crucially, since Cat only merges in after all steps of Internal Merge in *v*\*P are done (i.e. as part of the next phase), Cat–V (more exactly Cat–*v*\*) compounds may well exist in a language with V-to-*v*\* movement. In sum, we can have a three-way typology of Cat–V compounds (and VIMs) regulated by the verb movement parameter, as in Table 17.2.



Note that due to the inconsistent verb movement requirement, OV-Germanic languages may only have Cat–V compounds in non-V2 contexts, as in (22).

(22) German (via Vikner 2005)


The compound verb *bau-sparen* 'building-save; to building-save' cannot appear in finite main clauses but is only well-formed in situ, either in a sentence with a


modal verb (which fulfills the V2 requirement) or in a subordinate clause (where there is no V2 requirement). Germanic compounds like *bau-sparen* are known as "immobile verbs" (cf. i.a. McIntyre 2002; Vikner 2005; Ahlers 2010; Song 2016). They have a natural explanation in the current model.

As a final remark, the typology in Table 17.2 only concerns Cat–V compounds. So, on the one hand, Type I–II languages may still have unhindered Cat-N compounds/NIMs, e.g. French *homme grenouille* 'man-frog; frogman', Spanish *bocacalle* 'mouth-street; street intersection'. On the other hand, they may also have other types of complex verb in all contexts, such as particle verbs (including their inseparable variants), e.g. German *ein-kaufen* 'in-buy; to shop' (V-PP), *er-warten* 'er-wait; to expect' (*er*<OHG *ur* 'out'), Spanish *ex-traer* 'out-pull; to extract', and various phrasal verbs, e.g. French *mettre bas* 'put low; to give birth' (V+AP), Spanish *ponerse en camino* 'put.refl on way; to set off' (V+clitic+PP), German *Schwein haben* 'pig have; to be lucky' (V+NP). I do not discuss these other types of complex verb (more exactly complex predicate) but merely distinguish them from Cat–V compounds. To wit, items like *ein*, *bas*, and *Schwein* are base-generated as V-complements, i.e. VMs in the broad sense (cf. §1), but they are not VIMs.

## **6 Conclusion**

This chapter is a minimalist study of verb-internal modifiers (non-heads of modificational compound verbs). I have defended the position that compounding is a syntactic phenomenon based on the view that syntax is the only generative engine in the human language faculty. My main difference from previous syntactic models of compounding is that I have kept to the simplest definition of Merge (no Pair Merge or remerge) and the bare Root view (no RootP, Root-Root merger or Root incorporation), both of which are consequences of the SMT. Guided by the defective categorizer hypothesis, which is independently motivated in the minimalist feature system, I have derived VIMs in a labeling-based model. This new model not only avoids the conceptual problems in previous approaches, but also opens up a number of directions for future research. First, it can be extended to the nominal domain and allows the same Root material to modify both verbs and nouns. Second, it predicts modificational compounding to be a language universal and relates the typology of Cat–V compounds to the verb movement parameter. In addition, beyond the verbalizer level, there may be further loci that Cat can attach to, e.g. the *v*\* phase head. As such, compounding is not only a natural part of syntax, but also sheds new light on "external" syntactic issues such as general head adjunction and phase-level modifiers.


## **Abbreviations**


## **Acknowledgements**

This study forms part of my PhD project "Flexibility of syntactic categories: A cross-linguistic study" (funded by Cambridge Trust and China Scholarship Council). I am grateful to Ian Roberts, Theresa Biberauer, Anders Holmberg and Víctor Acedo-Matellán for discussions and encouragement. Thanks to the anonymous reviewer for helpful feedback, and to the audience at the Cambridge SyntaxLab (2 February 2016, 7 February 2017), the "Linguistic variation in the interaction between internal and external syntax" workshop (Utrecht, 8–9 Feb 2016), and the graduate conference WoSSP13 (Barcelona, 30 June–1 July 2016) for questions and comments on earlier versions of this work. All remaining errors are my own.

## **References**


Refsing, Kirsten. 1986. *The Ainu language*. Aarhus: Aarhus University Press.

Richards, Norvin. 2010. *Uttering trees*. Cambridge, MA: MIT Press.



# **Chapter 18**

# **Rethinking split intransitivity**

## James Baker

University of Cambridge

This chapter presents a development of Perlmutter's (1978) unaccusative hypothesis. It argues that the verbal domain should be considered to comprise an ordered series of functional heads here termed the VISCO hierarchy, and that this approach permits an improved understanding of split intransitive behaviours. The history of research into unaccusativity and split intransitivity is considered, with the strengths and weaknesses of the proposals made by Perlmutter (1978), Burzio (1986), Levin & Rappaport Hovav (1995) and others discussed and compared to the VISCO approach. The VISCO hierarchy is also compared to the hierarchy proposed in Ramchand (2008) and discussed in relation to the work of Sorace (2000). Issues such as difficulties in classifying unergatives/unaccusatives within a single language, apparent variation between languages, and the problem of syntax–semantics linking are all considered.

# **1 Introduction**

The purpose of this chapter is to present, in overview, a new approach to the phenomena of "split intransitivity" – phenomena where different sorts of intransitive predicates allow or disallow different syntactic behaviours. Specifically, I discuss this new approach in the context of a comparison to some of the major previous contributions in this area. Some strengths and weaknesses of these various existing approaches are critically evaluated, with arguments for how the new approach overcomes some of the weaknesses of the previous ones whilst retaining their important insights.

Examples of split intransitive phenomena include those presented in (1–4), from English, and (5–7), from other languages. In each case, different verbs exhibit different behaviours in relation to the constructions in question:

James Baker. 2020. Rethinking split intransitivity. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 385–420. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972868

(1)
	- a. i. The lollipops melted.
		- ii. Lucy melted the lollipops.
	- b. i. The window broke.
		- ii. Chris broke the window.
	- c. i. Harry coughed.
		- ii. \* Sarah coughed Harry. [intended meaning: 'Sarah made Harry cough.']
	- d. i. The pickpocket talked.
		- ii. \* The police talked the pickpocket. [intended meaning: 'The police made the pickpocket talk.']
(2)
	- a. the melted lollipops
	- b. the broken window
	- c. the recently arrived recruits
	- d. \* the coughed man
	- e. \* the talked pickpocket
	- f. \* the played cricketers
(3)
	- a. Lucy outtalked/outworked/outplayed/outswam/outran Chris.
	- b. \* Lucy outremained/outdied/outcame/outarrived Chris.
(4)
	- a. Lucy talked her way into the building.
	- b. Chris worked his way into the upper echelons of university administration.
	- c. Wayne played his way into the quarter-final.
	- d. \* Jessica died her way into the cemetery.
	- e. \* The train arrived its way into the station.
(5) German
	- a. Hans *ist* gegangen.
		Hans is gone
		'Hans went.'


(6) Italian
	- a. Ne arrivano molti.
		of-them arrive-3pl many-m.pl
		'(Of them,) many arrived.' (Bentley 2004: 221)
	- b. \* Ne studiano molti.
		of-them study-3pl many-m.pl
		'(Of them,) many studied.' (Bentley 2004: 222)

(7) Georgian
	- a. Rezo gamoizarda.
		Rezo.nom he.grew.up
		'Rezo grew up.' (Harris 1982: 293)
	- b. Nino-*m* daamtknara.
		Nino-erg she.yawned
		'Nino yawned.' (Harris 1981: 147)

The new approach to syntactic structure proposed to account for these phenomena, here labelled the VISCO hierarchy, is presented in §2. In §3, I then compare the VISCO approach with previous approaches following Perlmutter's (1978) unaccusative hypothesis. I argue that the VISCO approach overcomes a number of the problems of its predecessors, though I shall also stress that it should be seen as a development of ideas already in the literature, not something in radical opposition to them. §4 concludes.

# **2 The VISCO hierarchy**

In Baker (2016; 2018; 2019) I posit variants on the following structure for the thematic domain (equivalent to *v*P), termed the "thematic functional hierarchy" (TFH) or the "VISCO hierarchy" after the initials of the five heads it comprises; see Figure 18.1.<sup>1</sup>

<sup>1</sup>The reader may note similarities between the VISCO approach and that of Ramchand (2008). I compare the two approaches briefly here in §3.4.2, and in more detail in Baker (2018). Note that the variant of this hierarchy in Baker (2018), comprising a slightly different set of heads, is called the "VICTR hierarchy".


Figure 18.1: The thematic functional hierarchy

Arguments may be merged in the specifier positions of any of these heads and they gain their thematic interpretation from the positions in which they are merged. I describe an argument merged in Spec,VolitionP as bearing θ-volition, one merged in Spec,InitiationP as bearing θ-initiation, and so forth. A single argument may be merged in multiple positions and hence bear multiple "roles".<sup>2</sup> For example, in the sentence in Figure 18.2 *Lucy* (a volitional initiator undergoing a change of location) bears θ-volition + θ-initiation + θ-change.<sup>3</sup>

The five VISCO heads are determined on the basis of the main features which I have deemed to be determinants of split intransitive behaviour in the languages I have studied in this regard: [±volition], [±initiation], [±state], [±change] and [±oriented]. (These languages include English, the Western European languages discussed by Sorace (2000), and various languages with "split-S" case and/or agreement systems, including particularly Basque and Georgian; see Baker 2016; 2018; 2019 for further discussion.) Encoding each of these features on separate heads is in line with the principle "one feature–one head" of the cartographic programme (see van Craenenbroeck 2009 and discussion in Baker 2018) and is also supported by evidence for the hierarchical ordering of the features (partially discussed here in §3.4.6; see Baker 2018 for more in-depth discussion).

<sup>2</sup>This is of course at odds with the traditional analysis of thematic roles and argument movement going back to the government and binding (GB) framework. In GB, arguments must have exactly one thematic role, which is assigned to them on the basis of their D-structure position (in minimalist terms, their first-merge position); movement to positions in which thematic roles may be assigned is barred. However, there seems to be no a priori reason why these principles should necessarily hold, and a minimalist grammar may reasonably reject them.

<sup>3</sup> In this and all subsequent trees I omit all structure outside of the thematic domain, and represent V only in its first-merge position.


*Lucy is coming.*

Figure 18.2: An example of thematic role assignment

The volition head, which distinguishes whether an event is volitionally controlled or not – as opposed to initiation, which expresses causation independently of volition – allows us to capture behaviours such as the following (from Tibetan):



Volition seems to be marginally active in split intransitive behaviours in English – note the following contrasts, where the [+volition] sentences are more readily accepted with the diagnostics than the [−volition] ones:


### (10) a. [+volition]

	- i. ? Lucy trembled a tremble.
	- ii. ? Lucy skidded a skid.

The initiation and change heads capture, for example, the distinction between [−initiation, +change] intransitives, which allow the causative alternation in English, and [+initiation] or [−change] verbs, which do not (an analysis modified from Ramchand 2008):

(11) The causative alternation:


ii. \* The police talked the pickpocket. [intended meaning: 'the police made the pickpocket talk']

The change head, alongside state, further allows us to identify three classes of intransitives in English. [–state, –change] verbs allow constructions such as the following, which do not generally occur with [+change] verbs:

	- i. Lucy talked her way into the room.
	- ii. talker
	- b. [+change]
		- i. \* Lucy arrived her way into the room.
		- ii. \* melter, \*arriver

[+change] intransitives, on the other hand, can generally occur as prenominal past participles, but [−change] intransitives do not:

	- i. the melted ice
	- ii. the recently arrived recruits
	- b. [−change]
		- i. \* the coughed man
		- ii. \* the talked professor

[+state] intransitives form a distinct class, allowing neither set of constructions:

	- b. \* stayer
	- c. \* the stayed man

This is evidence for the operation of the [±state] feature.

Finally, I employ the head labelled oriented to account for the distinction between (inherently) telic verbs like *arrive* and *tear* ([+oriented]) and atelic verbs like *melt*, *stay* and *talk* ([−oriented]). Only the latter readily occur with *for hours*:

	- i. \* Lucy arrived for hours.
	- ii. \* The cloth tore for hours.
	- b. [−oriented]
		- i. The ice melted for hours.
		- ii. Lucy stayed for hours.
		- iii. Chris talked for hours.

In the following section I compare the VISCO hierarchy approach to split intransitivity with previous work on the topic.


## **3 The VISCO hierarchy and the unaccusative hypothesis**

### **3.1 Introduction**

In this section I discuss the VISCO hierarchy in relation specifically to the major existing approach to split intransitivity, the "unaccusative hypothesis", in its various forms. The unaccusative hypothesis was first introduced in Perlmutter (1978) and has been refined in much subsequent work. §3.2 overviews the unaccusative hypothesis as originally formulated. §3.3 identifies one major strength of the unaccusative hypothesis and considers how this is retained in the VISCO approach. §§3.4 and 3.5 then identify two important weaknesses of Perlmutter's original proposal, and discuss various attempts to overcome these – it is argued that these, in turn, have weaknesses which can be overcome in the VISCO model.

### **3.2 The origins of the unaccusative hypothesis**

It was Perlmutter's (1978) hugely influential article that first brought split intransitivity to the fore of discussion in generative linguistics. Working within the framework of relational grammar, Perlmutter formulated the following hypothesis:

(16) The unaccusative hypothesis (Perlmutter 1978: 160) "Certain intransitive clauses have an initial 2 but no initial 1."

"1" and "2" in relational grammar terms refer to primitives of grammatical relations. A "final 1" is a "surface subject"; a "final 2" a "surface direct object". In an ordinary active transitive sentence, the final 1 is also an "initial" 1, and the final 2 an "initial" 2. However, arguments may change relation between the initial and final levels ("strata"); hence for example in the passive the initial 2 is "advanced" to become a final 1 (the surface subject). The idea in (16), therefore, is that *certain intransitive clauses have an argument which bears the same relation as the direct object of transitive clauses*. As in the passive, however, this argument is advanced to the final 1/"surface subject" position, in accordance with the "final 1 law" which states that all clauses must have a final 1 (Perlmutter 1978: 160).

Perlmutter divided intransitive predicates into two groups, terming them "unergatives" (clauses with an initial 1) and "unaccusatives" (clauses with an initial 2). The basis of this division was semantic, though it was encoded in the syntax (see Levin & Rappaport Hovav 1995: 4–5). On Perlmutter's scheme, the division of intransitives into unergatives and unaccusatives was as follows (see Perlmutter 1978: 162–5 for fuller lists and discussion):


(17) Unergatives


(18) Unaccusatives


Perlmutter notes, however, that "alternative classifications are possible" (1978: 163).

Perlmutter's article advances the unaccusative hypothesis in order to explain the impersonal passive construction in languages like Dutch and Turkish. An example of this construction in Dutch is as follows:

(19) Dutch (adapted from Zaenen 1993: 131)
Er werd hard gewerkt.
there became hard worked
'There was hard work.'

The impersonal passive is, in effect, the passivisation of an intransitive clause. It is not, however, possible with all intransitives in the languages which allow it, for example (again from Dutch):

(20) Dutch (adapted from Zaenen 1993: 131)
\* Er werd gebloed.
there became bled
'There was bleeding.'

Perlmutter's idea is that the impersonal passive is possible with unergative clauses, but not unaccusative ones (for details of the mechanics of this, see that article).


Other research produced at about this time connected the unaccusative hypothesis to a number of other phenomena, such as pseudopassives (Perlmutter & Postal 1984: §6.3), auxiliary selection (Burzio 1981; 1986; Perlmutter 1989) and split intransitive case assignment (Perlmutter 1978: 165–166; Harris 1981).

Burzio (1981; 1986) reformulated the unaccusative hypothesis in government–binding terms. Under Burzio's approach, the argument of unergatives is an *external argument* whereas the argument of unaccusatives is an *internal argument*. In current minimalist terms, this is represented as follows, with the external argument first-merged in Spec,*v*P and the internal argument in the complement position of V:

The unaccusative hypothesis as formulated by Perlmutter and Burzio has both strengths and weaknesses. These will be the focus of the next three subsections, discussed in relation to more recent explorations of split intransitivity including the VISCO approach.

### **3.3 The central insight of the unaccusative hypothesis**

In spite of various weaknesses to be discussed subsequently, a key strength of the unaccusative hypothesis in its original form (as put forward by Perlmutter 1978) is its connection of the phenomena it aims to explain to grammatical relations. This means that, rather than intransitives merely being considered in isolation, parallels can be drawn with other types of clause. Thus, for example, the explanation of the impersonal passive is subsumed under a general explanation of the passive – it is possible only in clauses with an initial 1. These can be intransitive, as in (19), but also transitive, as in canonical examples of the passive such as the following (once more from Dutch):


(22) Dutch
Ik word verslagen.
I become beaten
'I am beaten.'

Similar parallels between intransitives and transitives, which can likewise be captured in terms of sensitivity to grammatical relations, can also be seen in many other split intransitive phenomena.

Under Burzio's (1981; 1986) reformulation of the unaccusative hypothesis, a variant of this insight is maintained in the following terms: that the status of a verb as unergative or unaccusative is directly related to the *position* of its argument in the syntactic structure (at D-structure, or in more recent terms at first-merge). This keeps the key strength of Perlmutter's analysis: the capturing of parallels between intransitive and transitive clauses.

This same insight is retained in the VISCO hierarchy approach to split intransitivity. Whilst the VISCO hierarchy offers a more fine-grained view of syntactic argument structure than Burzio's and other traditional approaches, allowing for more than just two positions for intransitive arguments, split intransitive behaviours are nevertheless connected to argument positions, and consequently the approach is able to capture parallels between intransitive and transitive clause types.

A couple of examples will serve to illustrate this. Firstly, the agentive suffix *-er* generally describes the argument which, in the equivalent clausal construction, would be first-merged in Spec,InitiationP. This is the case both with transitive *destroy* (> *destroyer*) and intransitive *talk* (> *talker*): in both cases it is a θ-initiation argument that is described. Secondly, the "undergoer" of a verb like *melt* occupies the Spec,ChangeP position whether the predicate is transitive or intransitive (see Figure 18.3). Similar parallels can be seen with other split intransitive diagnostic constructions (see Baker 2018; 2019).

Thus the VISCO approach maintains, in essence, the Burzio-type approach to understanding split intransitive behaviours, but combines it with a more fine-grained understanding of syntactic structure. Some reasons for preferring this more fine-grained syntactic structure are presented in the next two subsections, which identify two particular kinds of problem with the traditional unaccusative hypothesis which, it is argued, the VISCO approach is able to overcome.


Figure 18.3: Thematic role assignment in the causative alternation (*Lucy melts the butter.* / *The butter melts.*)

### **3.4 The problem of binary classification**

### **3.4.1 Introduction**

As noted, Perlmutter (1978) and much subsequent work divides intransitives into two main classes, unergatives and unaccusatives. This section will present various ways in which this binary classification proves to be problematic. It also discusses some suggested solutions, arguing that these have weaknesses but that these can be overcome by incorporating their insights into the VISCO model.

### **3.4.2 Ambiguity in classification criteria**

Given Perlmutter's criteria for distinguishing unergatives and unaccusatives in (17–18) above, one issue arises with predicates that satisfy criteria from both classes. For example, volitional acts are supposed to be expressed by unergatives, but verbs like *fall*, *slide* and *disappear* are meant to be unaccusative.


What happens, then, when verbs in this latter set describe volitional events: a deliberate act of falling or sliding, for example?

Perlmutter discusses this sort of verb (1978: 163–164), considering oppositions such as the following:

(23) a. The wheels slid on the ice.

b. Joe slid into third base. (Perlmutter 1978: 163–164)

(23a) (non-volitional) is analysed as unambiguously unaccusative; Perlmutter suggests (23b) is either unergative on account of its volitionality or a biclausal causative – presumably something like the following, where the embedded clause is unaccusative like (23a) above:

(24) [Joe cause [slid Joe into third base]]

Implicit in the first suggestion is that the volitionality of a predicate might somehow "override" its unaccusative status and lead to it being classified as unergative, but this is not developed by Perlmutter. (24) is arguably an over-complex representation of the sentence and requires an analysis (likewise not provided by Perlmutter) of why the second *Joe*, or whatever element occupies that position, is not pronounced.

Unergative/unaccusative ambiguities like these lead Perlmutter to not classify certain classes of verbs at all: he mentions verbs of motion, presumably verbs like *go* and *arrive*, as amongst those he chooses not to categorise.

Ambiguities of classification have proven to be a continuing problem in the theory of split intransitivity. Ongoing research in the years following Perlmutter's (1978) article identified many so-called "mismatches", where the classes of unaccusatives and unergatives appeared to differ between languages – or where different purported diagnostics of unaccusativity *within* a language identified different classes. An important early work in this regard is Rosen (1984). Rosen shows, for example, that the verbs meaning 'to sweat' show unaccusative properties in Choctaw (occurrence with accusative pronouns) but unergative properties in Italian (occurrence with auxiliary 'have'):


I will now discuss some particular sorts of problems in unergative/unaccusative classification: firstly, where the unergative and unaccusative classes in a given language appear to overlap on the basis of standard diagnostics (§3.4.3); secondly, where certain verbs cannot be reliably placed in either class according to the diagnostics (§3.4.4); thirdly (and relatedly), the problem of verbs which do not behave as expected in relation to the class to which they are supposed to belong (§3.4.5); and fourthly, the matter of cross-linguistic variation (§3.4.6). I will discuss some existing proposed solutions to these issues (where relevant), some problems with these solutions, and also the solutions which are possible in the VISCO approach.

### **3.4.3 Overlaps**

One problem with traditional approaches to unaccusativity occurs with apparent overlaps between unergative and unaccusative classes. This occurs, for example, when diagnostics of telicity are considered to diagnose unaccusativity – various authors have connected telicity to unaccusativity in various languages (such as Zaenen 1988, Borer 2005), including Schoorlemmer (2004: 227) for English. Certainly many "unaccusative" verbs do not readily allow "atelic" readings, as shown by their incompatibility with *for hours* in contexts like the following:

	- b. \* Chris died for hours.
	- c. \* The window broke for hours.

By contrast, all "unergative" verbs allow *for hours* in parallel contexts:

	- b. Chris swam for hours.
	- c. Harry played for hours.

However, many "unaccusative" verbs do allow *for hours* just as readily:

	- b. The wood burned for hours.

The class of verbs which allow *for hours* in this sort of sentence, then, overlaps with the classes identified as "unergative" and "unaccusative" by the other diagnostics. One way around the problem is simply to deny that telicity relates to unaccusativity at all. This is the approach taken by Levin & Rappaport Hovav (1995) (henceforth L&RH), which remains one of the most important works on split intransitivity to date. They show that not all "unaccusative" verbs are telic (pp. 172–173), which is the same position taken here. But it is not therefore possible on their approach to capture a link between telicity and argument structure, which is problematic as many authors (for example, Tenny 1987, Borer 2005) have presented evidence for just such a link, in English and other languages. For instance, Kiparsky (1998) links telicity to case in Finnish:

(30) Finnish

a. Ammuin karhu-*a*.
   I.shot bear-part
   'I shot at the bear.'

b. Ammuin karhu-*n*.
   I.shot bear-acc
   'I shot the bear.'

Case is of course often related to the relative positions of arguments, which suggests it is appropriate to link telicity to argument structure. This is lost on L&RH's approach.

An L&RH-style approach which did make reference to telicity might not fare much better, however. For them, verbs must be classified as either unergative or unaccusative (see §3.5 for discussion of how this is achieved): they would not capture how a verb like *melt* patterns with *break* (unaccusative) in terms of the resultative construction but with *work* (unergative) in terms of the *for hours* diagnostic. We cannot get around this problem by positing that *melt* is unergative when it is atelic but unaccusative when telic: it still shows the properties of an "unaccusative" in clearly atelic contexts, for example allowing the resultative construction (a prototypical diagnostic of unaccusativity, restricted to [−initiation, +change] verbs):

(31) The butter melted soft for hours.

This sort of pattern is not an issue on the VISCO approach, however. On this approach *for hours* and the other diagnostics are simply sensitive to separate features, separately encoded in syntactic structure, and overlaps between classes are not a problem.

The VISCO approach can be further compared in this regard to another important strand of work on split intransitive phenomena, labelled the "semantic approach" by L&RH (§1.2.2). Whilst Perlmutter's original conception of unaccusativity made reference to both syntax and semantics, the semantic approach attempts to explain split intransitive patterns in terms of semantics alone, without reference to syntactic notions such as the structural positions of arguments.<sup>4</sup> This approach denies that the difference between unergative and unaccusative predicates relates to syntactic structure, and instead claims that the distinction between the two is entirely due to the sensitivity of the diagnostic constructions to different semantic values of the predicate. Works which adopt this sort of approach include Van Valin (1990) and Zaenen (1993). Zaenen, for example, argues that the availability of prenominal past participles in Dutch is sensitive to telicity (32), whereas the availability of impersonal passives is sensitive to "protagonist control" (33):



The semantic approach does not predict that all purported "unaccusatives", or all purported "unergatives", need behave in the same way. Different diagnostics may pick out separate, if overlapping, groups of verbs. This insight is retained in the VISCO approach. In addition to the examples discussed above, observe for example that various verbs allow prenominal past participles but disallow resultatives:

<sup>4</sup>L&RH also identify the "syntactic approach" (§1.2.1), exemplified with Rosen (1984). Contrary to L&RH's implication, however, this is not the direct opposite to the semantic approach – while Rosen argues that unaccusative behaviours are not wholly determined by semantics, she still seems to allow some role for it.

18 Rethinking split intransitivity

	- a. the burned bacon
	- b. the recently arrived recruits
	- c. the departed visitor

	- a. The bacon burned black.
	- b. \* Lucy arrived tired. [intended meaning: 'Lucy became tired as a result of arriving']
	- c. \* Chris departed tired. [intended meaning: 'Chris became tired as a result of departing']

An approach which makes reference to semantics can elegantly account for mismatches of this sort simply by postulating that the two constructions are sensitive to different sets of semantic features ([−initiation,+change] for resultatives; [+change] alone for prenominal past participles). This is exactly what is done in the VISCO approach, somewhat following the semantic approach. Different diagnostics pick out different classes of verbs, summarised for English in Table 18.1 (for further discussion see Baker 2018; 2019).<sup>5</sup>

> Table 18.1: Summary of classes identified by English split intransitivity diagnostics


A further advantage of the semantic approach is its ability to capture straightforwardly the semantic basis of split intransitive behaviours. Many diagnostics pick out a set of verbs which can be defined in relatively clear-cut ways. Thus, each class has a well-defined semantic characterisation, unlike either of the "unergative" or "unaccusative" classes. For example, as I have argued in Baker (2016; 2018; 2019) and also discussed above, a number of diagnostic constructions in English are acceptable for the most part only with those intransitives that can be characterised as [−state, −change] (like *talk*, cf. [+state] *remain* and [+change] *arrive*):<sup>6</sup>

<sup>5</sup>Though the discussion here focuses on English, similar remarks can be made about other languages.

	- b. Lucy was talking/\*remaining/\*arriving away.
	- c. Lucy talked the talk/\*remained the remaining/\*arrived the arrival.
	- d. talker, \*remainer, \*arriver
	- e. Lucy outtalked/\*outremained/\*outarrived Chris.

On the other hand, as again already mentioned, the causative alternation and the resultative construction seem to be limited to intransitives characterisable as [+change, −initiation]:


	- b. \* Lucy arrived tired. [intended meaning: 'Lucy became tired as a result of arriving']
	- c. \* Lucy talked tired. [intended meaning: 'Lucy became tired as a result of talking']

A semantic approach to these phenomena, making no reference to syntactic grammatical relations or argument positions, would be able to capture the behaviour of these constructions by reference to the semantic features alone. This has the apparent advantage of not having to make reference to an additional concept of "unaccusativity", thus allowing for an apparently simpler grammar. The VISCO approach shares this advantage, defining classes in terms of semantic features with no separate concept of unaccusativity.

<sup>6</sup>It is true that these constructions are sometimes found with unaccusatives. But such forms are generally sporadic exceptions and mostly do not seem to reflect any underlying generalisation; speakers' judgements regarding them are often weaker.

However, the semantic approach misses some important generalisations which appear to connect split intransitivity to argument structure. Levin & Rappaport Hovav (1995: 11–12) discuss the example of prenominal past participles, which may only modify what would be "internal arguments" in the equivalent clausal constructions, under a standard Burzio-type approach to syntactic structure:

	- b. Internal argument of intransitive (unaccusative): *a recently appeared book*
	- c. External argument of transitive: \**a much-painted artist*
	- d. External argument of intransitive (unergative): \**a hard-worked lawyer* (L&RH: 11)

As was exemplified in §3.3, the ability to capture this sort of parallel between intransitives and transitives is an important strength of the traditional unaccusative hypothesis, and indeed of any implementations of it which make reference to grammatical relations or argument positions. L&RH argue, however, that the semantic approach fails to account for such parallels satisfactorily, as there is no single semantic notion that all "internal arguments" have in common – Van Valin's (1990) appeal to an "undergoer" macrorole, they claim convincingly, cannot be considered truly semantic but rather a generalisation over a number of specific semantic roles. This, then, is a major weakness of the semantic approach.

The VISCO approach, however, overcomes this weakness. As discussed in §3.3, it is able to account for parallels between transitives and intransitives in structural terms. However, because it adopts a more fine-grained approach to the structure of the thematic domain of the clause, and because this structure is explicitly connected to semantic features ([±volition], [±initiation] etc., valued on the functional heads), it is also able to take into account the semantic basis of split intransitive patterns as effectively as the traditional semantic approaches.

Another partial solution to the issue of overlaps between classes may be found in the work of Ramchand (2008), who proposes the following structure for the thematic domain – a fairly significant departure from traditional assumptions:

Arguments can be merged in the specifier positions of any of these three heads (the complement positions of *proc* and *res* are also available for arguments, but these do not seem to be filled in one-argument verbs). An argument merged in Spec,*init*P is termed an "initiator", that in Spec,*proc*P an "undergoer" and that in Spec,*res*P a "resultee". The same argument can be merged in more than one of these positions: thus for example *run* is an [*init*, *proc*] verb and its argument is both initiator and undergoer, and *arrive* is [*init*, *proc*, *res*] so its argument is simultaneously initiator, undergoer and resultee, whereas *roll* has only a *proc* projection and thus its argument is only an undergoer. Thus, there are not just two possible configurations for intransitive predicates (as suggested under the traditional unaccusative hypothesis), but multiple possibilities.

As a result of this, Ramchand's approach can account for certain of the discrepancies between split intransitivity diagnostics. Not only may the arguments of different predicates appear in more than two different positions – which itself allows for split intransitive diagnostics sensitive to argument structure to pick out more than two classes – the argument of a single given predicate may appear in multiple different positions at once, allowing it to be picked out by multiple argument-structure-sensitive diagnostics even if they are sensitive to different factors.

For example, the causative alternation is on Ramchand's analysis restricted to those intransitive verbs which lack an *init* component. This is independent of telicity, which is connected (in part) to the presence or absence of *res*. Ramchand thus accounts for both diagnostics in structural terms, without making the false prediction that (for example) all intransitives with causative alternants are telic. This prediction is shown to be false by examples such as the following:

	- b. Lucy melted the lollipops.


However, there are some patterns Ramchand's approach does not so obviously account for. For example, it does not identify the [+change] class, which I have argued in favour of in §2.<sup>7</sup> In Baker (2018), I identify this and other problems, arguing at length that the patterns are more readily accounted for in terms of a more elaborated sequence of heads. The parallels between Ramchand's approach and my own are, however, very strong, even if the particular heads identified are different.

The VISCO hierarchy approach, then, allows diagnostics to pick out overlapping classes without encountering these issues. Recall that verbs like *melt* pattern both with verbs like *work* (in terms of diagnostics of telicity like *for hours*) and with verbs like *break* (in terms of other diagnostics: resultatives, causatives, prenominal past participles). If we assume all of these diagnostics are connected to argument structure, this is difficult – if not impossible – to account for on the assumption that there are only two available argument positions in intransitives. Either telicity or the other diagnostics must be sensitive to argument structure on this more traditional approach; it does not seem that they can both be. However, if we allow for the possibility of multiple argument positions – and specifically multiple "internal" argument positions – we are able to account for both sets of phenomena in argument structure terms.

### **3.4.4 Non-classified verbs**

This section considers the problem, for the traditional unaccusative hypothesis, of predicates which apparently cannot be classified as unergative or unaccusative. For the unaccusative hypothesis to be tenable, there should be some way of identifying any given intransitive predicate one way or the other. This is desirable not only from a theoretical perspective (we do not wish to be making claims about the status of predicates on an ad hoc basis) but also from an acquisitional one: the language learner needs some method by which verbs can be identified as belonging to one class or the other.

The obvious method to determine the status of predicates as unergative or unaccusative is via the various "unaccusativity diagnostics", a number of which have already been discussed. These are morphological or syntactic constructions which permit some intransitives to participate but disallow others. But matters are not as straightforward as might be hoped. Consider, in English, verbs denoting states. As discussed above, and illustrated in more detail immediately below, these are rather consistently disallowed both with constructions purportedly diagnostic of unergatives (43) and with those diagnostic of unaccusatives (44):<sup>8</sup>

<sup>7</sup>A reviewer suggests that the [+change] property as identified by the prenominal past participle diagnostic might be related to the non-finite participle morphology, rather than being part of the extended structure of finite verbs. It seems to me most economical to assume that the structure of finite and non-finite forms does not differ in this way; in any case, this does not account for the apparent operation of the [±change] feature in other ways.

(43) a. \* Lucy remained her way into the room.

	- b. \* Lucy remained Chris. [intended meaning: 'Lucy made Chris remain']
	- c. \* the remained man

Similar observations can be made of many other verbs denoting states: *stay*, *last*, *survive*, *persist*; *sit*, *stand* (in their stative senses), etc. It is true that these do sporadically allow certain of the diagnostic constructions (e.g. *survivor*, *outstay*; *Lucy stood the statue in the corner*), but such behaviours do not seem to form part of any general pattern and it is not clear that they do much to resolve the issue.

There is one diagnostic that does group statives with other verbs: that of telicity. Statives can freely occur with phrases like *for hours*:

(45) Lucy remained for hours.

But as discussed in §3.4.3, the classes identified by this diagnostic do not line up neatly with those picked out by the other diagnostics: both "unergative" and "unaccusative" verbs can occur with *for hours*, so the telicity diagnostic does not solve the problem.

Of course, it is notionally plausible that the division of predicates into one or the other class derives from some sort of innate knowledge. Such knowledge would most probably be specific to the language faculty – it is hard to see how the mechanisms which allow the linking of semantics to grammatical relations or syntactic argument structure could have any non-linguistic applications. However, appeals of this sort to "Universal Grammar" (UG) are not very compatible with a minimalist approach, which favours an "impoverished" view of UG. The minimalist view thus avoids the methodological error of appealing to innateness too readily and failing to seek out deeper explanations. UG is predicted to contain as little as possible, and we ought not to place the mechanisms for distinguishing unergatives from unaccusatives within it if better options are available. That is, we ideally do not want a UG principle which states "Intransitive predicates denoting changes and states are unaccusative; others are unergative" or the like.

<sup>8</sup>*Remain* and many other statives are permitted with locative inversion and *there*-insertion:

(i) In the room remained a man.

(ii) There remained a man.

However, I do not consider these true diagnostics of argument structure: see Levin & Rappaport Hovav (1995); Baker (2018; 2019: Ch. 6).

This sort of UG approach would also run into problems with cross-linguistic variation. If, as seems to be the case, languages vary to some extent as to how they classify intransitives, then it would seem UG could only provide partial information as to how this classification is to proceed. This would leave us with the problem of determining what information is, and is not, in UG – a problem which is by no means easy to solve. This problem is perhaps particularly apparent with intransitives denoting states. State verbs often show a great deal of language-internal variation and apparent lexical idiosyncrasy with regard to split intransitivity diagnostics. In Dutch, for example, some state verbs occur with 'have' and others with 'be':

(46) Dutch (Sorace 2000: 870)


Similar observations can be made with regard to case marking of statives in Basque and Georgian (Baker 2018). This suggests UG does not provide much, if any, help in the classification of these verbs into one of the two purported groups.

Given all this, how can the language learner (or the linguist) determine whether English stative verbs are unergative or unaccusative? They have generally been assumed to belong to the latter class (see Perlmutter 1978: 162–163), but as we have seen there is little positive evidence in support of this, only the negative evidence that they do not generally pattern with the "unergatives".


Another possibility is that UG, or general cognitive procedures, allow for some sort of "default rule", whereby verbs for which there is no positive evidence as to their status are classified into one particular group. However, it is not clear (at least at present) how we might determine which of the two groups is the default, suggesting we ought not to pursue this path if an alternative can be found.

The VISCO approach may be just such an alternative. It does not run into problems with [+state] intransitives. As there is no requirement on this approach for these verbs to be classified into one of just two groups, the fact that the diagnostics do not allow us to do so is not problematic. Rather, stative verbs can simply be grouped into a class of their own.

### **3.4.5 Exceptional verbs**

Another problem for the unaccusative hypothesis concerns verbs which, having been classified as either unergative or unaccusative, fail to show particular behaviours expected of the group in question. In English, this is particularly problematic for the unaccusative class. Because not all purported "unaccusatives" behave in the same way in relation to the diagnostics, authors working within the unaccusative hypothesis framework must posit reasons for the "exceptional" behaviour of certain sorts of predicate. Thus, for example, Levin & Rappaport Hovav (1995) provide arguments for the incompatibility of resultatives with directed motion (§2.3.2) and stative (§2.3.3) intransitives, and for the incompatibility of the causative alternation with verbs of existence and appearance (§3.3, see especially p. 126). This sort of approach – whereby some members of a class whose members are able to enter into a given construction for one reason (such as the presence of an internal argument/absence of an external argument) are ruled out in that construction for some other reason – is not inherently problematic. L&RH's use of it in this instance, however, runs into various problems.

Firstly, note again that the resultative construction and the causative alternation are available in English with very nearly the same class of verbs (see also Baker 2018; 2019):<sup>9</sup>

<sup>9</sup>The major exception is the class of verbs comprising *redden*, *blacken*, *ripen* etc. which allow causatives but not resultatives:

(i) The wood blackened.

(ii) The fire blackened the wood.

(iii) \* The wood blackened black.

One explanation for this property is that the result state (*red*, *black*, *ripe* etc.) incorporates directly into the verbal element *-en*.

	- b. The wood burned black.
	- c. The window broke into pieces.
	- d. \* Lucy arrived tired. [intended meaning: 'Lucy became tired as a result of arriving']
	- e. \* Lucy persisted happy. [intended meaning: 'Lucy became happy as a result of persisting']
	- b. Chris burned the wood.
	- c. Chris broke the window.
	- d. \* Chris arrived Lucy.
	- e. \* Chris persisted Lucy.

This correspondence occurs even with verbs which otherwise appear to be idiosyncratic exceptions to the non-availability of these constructions:

	- b. \* Chris died Lucy.

L&RH's approach, however, does not account for this close correspondence between the classes picked out by the two diagnostics. This correspondence cannot be explained simply by claiming that the constructions are only available with "unaccusatives", because the pattern is subtler than that: not all intransitives claimed to have internal arguments allow the two constructions. Further, L&RH's arguments for the incompatibility of resultatives with certain "unaccusatives" do not generalise to the incompatibility of these same verbs with the causative alternation (and vice versa). The inherent delimitation of directed motion verbs may be a satisfactory account of their non-occurrence with resultatives (L&RH: §2.3.2), but it does not seem relevant to the causative alternation; similarly, while it may be reasonable that there are no such things as delimited states and hence no resultatives of statives (§2.3.3), this line of argument does not obviously extend to the lack of causative alternants of stative forms.

Likewise, L&RH's argument for the non-occurrence of causative alternants of verbs of existence and appearance does not account for the non-occurrence of these verbs with resultatives:


	- b. \* Lucy vanished invisible.

L&RH argue (p. 126) that these verbs lack causative alternants because they have neither external nor internal causes, but this is irrelevant to whether they allow resultatives on their analysis (though cf. Ramchand 2008; Baker 2018; 2019). They do not provide any argument for the non-occurrence of causatives with other "unaccusatives".

To reiterate, then, L&RH fail to capture significant similarities between the behaviour of these two diagnostics. In this respect, then, they can be argued to do less well than the "semantic"-type approaches, which are able to identify features the verbs allowing resultatives and causatives have in common ([+change, −initiation], as I suggested above). The fact that this class can be identified positively in terms of these features alone, rather than positing an unaccusative class and various exceptions, might also seem to favour something more like a semantic approach – or the VISCO approach. We can make a similar argument regarding prenominal past participles, some examples of which are as follows:

	- b. the broken window
	- c. \* the survived man
	- d. \* the been man
	- e. \* the swam athlete

L&RH would have to provide some reason to rule these out with statives, whereas the VISCO approach need only state that they are restricted to verbs of change. On this approach, as discussed above, there is no expectation that different split intransitivity diagnostics should all identify more-or-less the same two classes, and indeed this is not what we observe. Accordingly, there is a reduced need to explain away the apparent exceptions: the cases where certain verbs do not behave as their class membership predicts.<sup>10</sup> However, as already noted, the VISCO approach also has the advantage over traditional semantic approaches in that it

<sup>10</sup>It is true that there are some [+change, −initiation] verbs that do not allow resultatives and/or causatives: among them, *die*, verbs of (dis)appearance and verbs like *redden*, *blacken* etc. There are also exceptions to the rule that [+change] verbs allow prenominal past participles (*\*the gone man*, etc.). The number of exceptions to be accounted for is still less than on an approach that treats these constructions as in principle available with all "unaccusatives". See discussion in Baker (2018; 2019).

### 18 Rethinking split intransitivity

nevertheless connects class membership to syntactic structure and accordingly is able to account for particular patterns which those other approaches do not.

### **3.4.6 Variation between languages**

Variation in split intransitive phenomena has been highlighted by various authors, for example Rosen (1984) cited above, and more recently in the work of Sorace (see particularly Sorace 2000, forthcoming). Sorace (2000) describes variation in auxiliary selection in German, Dutch, Italian and French: these languages all allow either 'be' or 'have' as the auxiliary in the periphrastic perfect, with 'be' traditionally held to occur with unaccusatives and 'have' with unergatives. However, the distribution of 'be' and 'have' is different in each language. The following examples illustrate:

(52) French
   Il a couru.
   he has run
   'He ran.'

(53) German
   Er ist gelaufen.
   he is run
   'He ran.'
Sorace shows, however, that this cross-linguistic variation is amenable to analysis in terms of a hierarchy of semantic categories of intransitive verbs: the auxiliary selection hierarchy (ASH) or split intransitivity hierarchy (SIH) (Sorace & Shomura 2001). This is given in Table 18.2. Whilst the "cut-off point" between 'have' verbs and 'be' verbs varies between languages, in general categories toward the top of the hierarchy are associated with 'have' and those toward the bottom with 'be'.

Further research has shown that the SIH can be applied to split intransitive phenomena other than auxiliary selection (Sorace 2004: 263–264, Montrul 2005), although it may not apply in all cases (Baker 2013; 2018).

Most theoretical accounts of split intransitivity have little to say about what, if any, cross-linguistic variation should be possible. However, as Sorace's work shows, languages not only seem to vary in which predicates show "unergative" and "unaccusative" behaviours, but this variation appears not to be purely random.


Table 18.2: The split intransitivity hierarchy (Sorace 2000)


The VISCO hierarchy, however, can be seen as an implementation of the SIH. Categories closer to the top of the SIH correspond to positively valued features of heads towards the top of the VISCO hierarchy, as summarised in Table 18.3 (for further discussion see Baker 2019).

Table 18.3: Correspondences between the SIH and the features encoded on the heads of the VISCO hierarchy


This enables an explanation of why split intransitive phenomena show the patterns of variation they do, something which is not furnished by other theories. This is most easily illustrated with auxiliary selection, which is also the phenomenon best studied in relation to the SIH. The generalisation which can be made is that in languages with a 'have'/'be' split amongst auxiliaries in the periphrastic perfect with intransitives, 'be' is associated with heads below a certain point when they bear a positively valued feature (e.g. with oriented when it bears [+oriented], but not where it bears [−oriented]); 'have' is then associated with heads bearing a positively valued feature above that point. The cut-off point in question, however, varies between languages.<sup>11</sup>

To summarise this discussion briefly: it has considered the problem, manifest in various ways, of attempting to divide intransitives into just two groups. It has argued that the VISCO approach, which identifies multiple classes of intransitives in a way which is connected directly to syntactic structure, is able to overcome this problem where other approaches run into difficulties.

### **3.5 The problem of semantics–syntax linking**

A further issue with the unaccusative hypothesis as originally proposed concerns the proposed relation between semantics and syntax. The link between the meaning of an intransitive predicate and that predicate's status as unergative or unaccusative is not nearly as straightforward as might be thought ideal. The proposed unergative and unaccusative classes each divide into a number of subgroups, and the semantic characterisations of each are somewhat heterogeneous: there is no immediately apparent semantic feature that all the predicates in one of the classes possess and all the others lack. Thus, the semantic criteria that Perlmutter (1978) provides (17–18) are not necessarily very informative – in particular, the notion of "semantic patient" (18b) is unhelpfully vague. And some of the classifications seem rather arbitrary – why, for example, should "involuntary bodily processes" be unergative rather than unaccusative?

This problem is overcome somewhat by Levin & Rappaport Hovav (1995). Considering a number of diagnostics in a high level of detail, primarily although not exclusively in regard to English, L&RH argue in favour of a traditional interpretation of unaccusativity, where the two classes of intransitive predicates (unergative and unaccusative) are both semantically determined and syntactically represented.

The mapping of semantics to syntax on L&RH's approach is achieved via "linking rules" (Ch. 4). The rules that L&RH identify are as follows:

(54) a. Directed change linking rule

"The argument of a verb that corresponds to the entity undergoing the directed change described by that verb is its direct internal argument." (p. 146)

<sup>11</sup>This does not by itself account for all the patterns captured by the SIH; for further discussion of how this may be done on a TFH approach see Baker (2018).



To summarise, with examples of verbs whose arguments are typically subject to each rule:


These rules are ordered (L&RH: §4.2). Thus, for example, the directed change linking rule (54a, 55a) takes precedence over the immediate cause linking rule (54b, 55b) in at least some languages (L&RH: 159–164, 166). Hence an entity which both undergoes a directed change and is an immediate cause of the eventuality is represented by an internal argument, not an external one. This is apparent, for example, in the case of Italian *cadere* 'to fall', which takes auxiliary *essere* 'to be' (associated with unaccusatives) even when agentive (L&RH: 163):

(56) Italian

Luigi è caduto apposta.
Luigi is fallen on.purpose
'Luigi fell on purpose.'

In summary, on L&RH's approach intransitive predicates with an immediate cause argument are unergative *unless* that argument also undergoes a directed change or has its existence asserted or denied. All other intransitives are unaccusative.

The principal advantage, then, of L&RH's approach – as opposed to previous attempts to characterise unaccusativity – is an explicit characterisation of the different behaviour of different intransitive predicates in semantic terms, whilst directly relating these behaviours to the syntactic property of the position of arguments. L&RH are thus able to maintain certain advantages of the "semantic" approach, whilst overcoming some of its weaknesses by building on existing insights into syntactic determinants of split intransitive phenomena. Furthermore, the semantic characterisation they present is relatively straightforward – relying only on the concepts of "directed change", "immediate causation" and "assertion/denial of existence". This compares favourably to the numerous semantic categories identified by Perlmutter (1978), allowing significant generalisations to be made as to which predicates fall in which class.

The linking rules approach is not without weaknesses of its own, however. Some of these concern the rules themselves (I will discuss other weaknesses subsequently). Now, it seems undeniable that we need some way of linking semantics to syntax if split intransitivity is indeed sensitive to both. The idea of linking rules is not problematic per se. But the specific forms of the rules L&RH suggest are. They seem largely accurate in describing the classes of verbs which show "unergative" and "unaccusative" behaviours – though they have some weaknesses even in this regard, which I shall discuss below. But despite this strength in terms of purely descriptive classification, it is difficult to come up with independent, explanatory reasons for why they should have the forms they do. Why are they sensitive to these semantic factors, and not others? One can think of plenty of other factors which might just as well be candidates (e.g. volition, sentience, eventivity/stativity, telicity, affectedness; "change" as a general concept rather than directed change specifically).<sup>12</sup> The basis for the connections between these semantic factors and the external/internal argument distinction is in some cases similarly unclear. Why, for example, should assertion of existence be a criterion that yields unaccusatives, and not unergatives? Neither is it easy to justify the order of the rules. Why should the directed change linking rule take precedence over the immediate cause linking rule, and not vice versa?

These are problematic issues from an acquisitional perspective. The forms of the rules – the semantic features they make reference to, the mapping to external or internal arguments, their ordering – seem rather arbitrary. This arbitrariness can only make the acquisition process more difficult, particularly when the data that are available to help language learners classify predicates one way or the other are often limited at best.

<sup>12</sup>L&RH do discuss (§4.3.1) their reasoning for rejecting some of these, but this does not explain why language learners do not posit them.

A potential source of evidence for the mapping one way or the other is the behaviour of arguments of transitive verbs: most clearly for the directed change linking rule, as transitive arguments which undergo directed changes are internal arguments, for example *the city* in the following case:

(57) Hannibal destroyed the city.

But this reasoning may not generalise to all the rules. True, causes and "other" arguments are generally external and internal arguments of transitives respectively, as in the following example:

(58) Lucy touched the wall.

Here, *Lucy* (the external argument) is the immediate cause of the event and *the wall* (the internal argument) does not come under the scope of any of the rules. Instances like this could allow the derivation of the immediate cause and default linking rules. But psych predicates pose a problem, for example:

(59) Sarah loves Chris.

Here, *Sarah* (the external argument) is not necessarily best analysed as a cause, and *Chris* (the internal argument) may well be. Thus the mapping to syntactic positions exhibits the opposite pattern from that which the linking rules would generate. It is also not clear if transitives provide any evidence as to the status of an argument of which existence is asserted or denied.

One solution would be to posit that the linking rules, and maybe their order as well, are encoded in Universal Grammar. But this does not seem very attractive, particularly if a better proposal can be made. Most linguists today would probably reject such a "rich UG" approach.

Ideally, perhaps, learners would have access to some sort of generalised linking rule format on which all the rules might be based (this might be either innate or emergent). It is not clear that L&RH's linking rules can be reduced to a satisfactory general format: certainly there does not seem to be one which overcomes the problems of the arbitrariness of the semantic factors and of whether each factor maps to external or internal arguments. The issue of the ordering of the rules would remain problematic in any case.

In Baker (2018; 2019), however, I propose exactly this sort of "generalised linking rule" which, utilising a VISCO-type hierarchy, overcomes these problems with L&RH's rules:

(60) Generalised linking rule

An argument of which the property [+a] is predicated is merged in the corresponding Spec,AP.


The properties [+a] in question are the features [+volition], [+initiation] etc.; the corresponding APs are VolitionP, InitiationP, etc. The general format of the rule allows for much easier acquisition, and there is no need to order the rules so that certain semantic features take precedence over others, which obviates the need to justify such a rule ordering, or for learners to acquire it. (Where two properties are predicated of an argument – say, [+initiation] and [+change] – that argument is simply merged in both corresponding positions, as discussed above.)

The VISCO approach does not require us to posit seemingly arbitrary associations of semantic properties to external or internal argument positions: on this approach, the two-way division between "external" and "internal" arguments is too simplistic. A related issue does still present itself, however: why are the heads ordered in the way they are? (Note that this is a problem here only with the syntactic structure itself – it is external to the linking rules.) From an acquisitional perspective, however, it is not such an issue as it might first appear: there is ample evidence from transitive and ditransitive clauses for the order of at least some of the heads in the hierarchy.<sup>13</sup> For example, θ-initiation arguments always seem to be merged higher than θ-change ones:

(61) Hannibal [θ-initiation] destroyed the city [θ-change].

This allows the learner to posit InitiationP as being higher in the structure than ChangeP. See Baker (2018) for in-depth discussion.

As to why the particular order of heads should have come about in the first place, I admit I do not have a full explanation. Such deep explanations for the ordering of heads in syntactic structures are of course a more general issue not restricted to the particular subpart of sentence structure posited here. One partial explanation may be that the heads higher in the structure (e.g. volition, initiation) relate more to the properties of the arguments themselves, whereas the lower heads (e.g. change, oriented) say more about the properties of the event. But this is incomplete and subject to criticism. Overall, however, the VISCO approach, with the generalised linking rule, allows a neat way of capturing the linking between semantics and syntax which does not run into some of the problems encountered by previous accounts.

<sup>13</sup>I admit I am not aware of much good language-internal evidence for the relative order of, firstly, volition and initiation and, secondly, change and oriented – though see §3.4.4 for some cross-linguistic evidence for the orders posited. It is not clear that much hinges on which orders the learner adopts in these cases, however.


## **4 Conclusion**

Perlmutter's (1978) unaccusative hypothesis has remained a powerful idea since its inception. Numerous linguistic phenomena have shown themselves to be amenable to analysis in terms of unaccusativity. But the hypothesis, and subsequent implementations and adaptations of it, have also proved problematic in various ways. My approach to split intransitivity, captured in terms of the VISCO hierarchy, overcomes many of these difficulties by positing more fine-grained distinctions in syntactic structure. However, it retains key elements of the original unaccusative hypothesis: the idea that split intransitive behaviours are semantically determined but syntactically encoded, specifically in terms of "grammatical relations" here formalised (after Burzio 1986 and many others) in terms of argument positions. The VISCO approach to split intransitivity should be seen, therefore, not as a radical alternative to the unaccusative hypothesis but as a development of it.

## **Abbreviations**


## **References**

Baker, James. 2013. *Theoretical approaches to alignment, with special reference to split/fluid-S systems*. University of Cambridge. (Undergraduate dissertation).


Baker, James. 2019. Split intransitivity in English. *English Language and Linguistics* 23(3). 557–589. DOI: 10.1017/S1360674317000533.



Harris, Alice C. 1981. *Georgian syntax*. Cambridge: Cambridge University Press.



*ity puzzle: Explorations of the syntax–lexicon interface*, 207–242. Oxford: Oxford University Press.


# **Chapter 19**

# **The verbal passive: No unique phrasal idioms**

Julie Fadlon Tel Aviv University

Julia Horvath Tel Aviv University

Tal Siloni Tel Aviv University

Ken Wexler Massachusetts Institute of Technology

> The paper reports and discusses two studies we conducted to systematically assess the distribution of English phrasal idioms across various diatheses (transitive, unaccusatives, adjectival and verbal passives). Both studies, a quantitative survey of idiom dictionaries and an experiment using invented idioms, show that the distribution of phrasal idioms depends on the diathesis of the idiom's head. While transitives, unaccusatives and adjectival passives can head idioms specific to them, verbal passive idioms uniformly have a transitive (active) version. This pattern, we argue, shows that phrasal idioms are stored in the (pre-syntactic) lexicon as subentries of the entry of their head (not as independent entries). Further, it reinforces proposals that the verbal passive is a post-lexical output, which consequently lacks its own lexical entry, contrasting in this respect with the other diatheses we examined. Our findings also provide evidence that the lexicon comprises derived entries, which we take as indication that it is an active component of grammar.

Julie Fadlon, Julia Horvath, Tal Siloni & Ken Wexler. 2020. The verbal passive: No unique phrasal idioms. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 421–459. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972870


## **1 Introduction**

It has sporadically been observed in the literature that there is a gap in the distribution of idioms. Specifically, Dubinsky & Simango (1996), Marantz (1997), and Ruwet (1991) report that in Chichewa, English, and French there do not seem to be any idioms specific to the verbal (eventive) passive (e.g., *sold* as in *the first customer was sold the car*), while there are idioms specific to the adjectival (stative) passive (e.g., *shaven*). In other words, an idiom in the verbal passive must have a transitive (active) version.

A first quantitative survey checking the validity of these observations is reported in Horvath & Siloni (2009) regarding Hebrew. The survey examined four diatheses: the verbal passive, adjectival passive, transitive, and unaccusative. The results of the survey showed that the unaccusative and adjectival passive diatheses can have idioms that do not have a transitive version, and idioms in the transitive diathesis do not necessarily have an unaccusative version, as illustrated in (1–3) below. But the verbal passive always shares its idiomatic meaning with the transitive alternant ("#" means the corresponding sequence of words does not have an idiomatic meaning).<sup>1</sup>

### (1) Hebrew

a. Unaccusative
   yaca le-*x* me-ha-af
   went.out to-*x* from-the-nose
   'got tired of'

b. Transitive
   \# hoci le-*x* me-ha-af
   took.out to-*x* from-the-nose
### (2) Hebrew

a. Adjectival passive
   dafuk ba-roš
   knocked in.the-head
   'stupid'

b. Transitive
   \# dafak et *x* ba-roš
   knocked acc *x* in.the-head

<sup>1</sup>The citation form in Hebrew is third person singular past; glosses and translations match this; the idioms, of course, are not limited to the past tense. For the sake of clarity, a lexically nonfixed constituent in Hebrew idioms is marked by *x*.


### (3) Hebrew


\# nosaf šemen la-medura
got.added oil to.the-fire

Further, Siloni et al. (2018) report an experiment on Hebrew speakers that reinforces the claim that the distribution of phrasal idioms depends on the diathesis of their head. Participants in the experiment perceived the likelihood of the verbal passive to share idiomatic meanings with its transitive counterpart as significantly higher than that of both the adjectival passive and the unaccusative.

This paper advances the claim that the reason why the verbal passive differs from the other diatheses regarding the ability to appear in idioms specific to the diathesis is an inherent (independently motivated) difference between the former and the latter that affects the storage possibilities available to each.

As is well known, idioms exhibit an inherent duality. On the one hand, they are associated with an unpredictable, conventionalized meaning, which must be stored (listed) in mental representations. On the other hand, they are units with internal syntactic structure parallel to units built in the syntax. Further, idioms are constructs that interact with grammar – they can be embedded, can allow passivization, etc. This means that they must be stored intra-grammatically, that is, in the lexical component of grammar. We will claim that idioms in the verbal passive cannot be stored the way the adjectival passive, unaccusative (more generally, anticausative) and transitive idioms are stored. This, in turn, will account for their inability to head their "own" idioms.

The paper first addresses the question of the crosslinguistic validity of the quantitative results mentioned above with regard to Hebrew. This is particularly important since the verbal passive in Hebrew is known to occur with relatively low frequency in spoken language in comparison to its English counterpart (Berman 2008). This may be argued to potentially affect the inventory of verbal passive idioms in Hebrew. It is thus essential to examine the situation in another language, where usage of the verbal passive is more frequent. In order to do that, we conducted a quantitative study of the distribution of idioms in English.

Two additional factors make this comparative extension even more worthwhile. First, the passive morphology in Hebrew versus English is of a different type. While the Hebrew verbal passive is formed by means of a verbal template, in English, the verbal passive is periphrastic, and formed by use of an auxiliary verb. Second, a comparative study by Meltzer-Asscher (2012) argues that the verbal passives of English and Hebrew also differ with regard to the realization of the demoted external θ-role. While in English various diagnostics detect the syntactic presence of the external θ-role (e.g. Jaeggli 1986; Baker et al. 1989; Collins 2005), in Hebrew, the role is not syntactically present, but is assigned to a variable in the semantic representation (along lines suggested by Chierchia 2004; Reinhart 2003; Horvath & Siloni 2009, among others).

Since the term "idiom" is pre-theoretic and refers to various types of fixed expressions, we adopted Horvath & Siloni's (2017; 2019) definition identifying a core set of idioms, which has allowed us to test a coherent set of expressions. The set consists of conventionalized multilexemic expressions whose meaning is figurative (metaphoric) and unpredictable by semantic composition.<sup>2</sup>

Further, following Horvath & Siloni (2017; 2019), we distinguish between phrasal and clausal idioms: the former are headed by a lexical head, while the latter involve sentential functional material such as tense etc., as defined in (4), and illustrated in (1–3) and (5) respectively. This paper deals with phrasal idioms. For more on the analysis of clausal idioms, see Horvath & Siloni (2019).

	- a. Phrasal idioms are headed by a lexical head (e.g. 1–3).
	- b. Clausal idioms are headed by a sentential functional head (a fixed tense or mood, a modal, obligatory sentential negation or CP-material); they are not necessarily full clauses (e.g. 5, where the modal and negation are fixed)


<sup>2</sup> For the sake of clarity, it is worth noting that a property often mistakenly conflated with the unpredictability of idioms' meaning is the level of opacity or transparency of their meaning. Idioms indeed differ from one another in the level of their transparency (opacity). For example, the idiom in (ii) may be felt less transparent than the one in (i). However, the degree of transparency can be determined only once we know the meaning of the idioms; neither the former nor the latter meanings can be predicted based on the meaning of their building blocks. Hence, the meaning of (i) is unpredictable (even if a posteriori, more transparent) just like that of (ii). Both types of idioms are included in our study.


(5) wouldn't put it past someone

'Consider it possible that someone might do something wrong or unpleasant.'

In §2 we report a survey of four English idiom dictionaries, examining the distribution of phrasal idioms across four diatheses: the verbal passive, adjectival passive, transitive, and anticausative.<sup>3</sup> Such surveys are necessary for the study of idiom distribution; intuitions in themselves are not sufficient, as speakers sometimes have a hard time distinguishing whether a certain idiom variant exists and is commonly used or only could exist, i.e., is a priori possible, but is not documented. This is so because the spontaneous formation and learning of novel idiomatic expressions is part of speakers' linguistic competence. Also, knowledge of idioms varies considerably among speakers (just like vocabulary knowledge). The survey was complemented by studying the real-life use of idioms (via Google searches), accompanied by consultation of speakers. The results of the survey have reproduced the pattern discovered in the Hebrew survey, distinguishing the verbal passive from the other diatheses.

In §3 we describe an experiment which tested the likelihood of phrasal idioms in the verbal passive, adjectival passive and anticausative to share their idiomatic meaning with their transitive alternant. Again, the experiment reproduced the same pattern of findings, singling out the verbal passive as significantly more likely to share its idiomatic meaning with its transitive alternant. §4 offers our analysis of the findings in terms of lexical storage, and §5 evaluates possible alternative analyses.

# **2 The distribution of phrasal idioms across diatheses: A survey**

### **2.1 Method**

We examined the distribution of phrasal idioms across four diatheses, transitive (with an anticausative alternant), anticausative, verbal passive, and adjectival passive. We searched four English idiom dictionaries, looking for "unique" idioms (as defined in 6).<sup>4</sup>

<sup>3</sup>We use the term anticausative instead of unaccusative to emphasize that for the purposes of this study it is crucial that these predicates have a transitive alternant in the language, while the question as to whether or not they involve an unaccusative syntax is not directly relevant.

<sup>4</sup>The English dictionaries we used are listed in the references section (see Ammer 2013; White 1998; Heacock 2003; Spears 2006).

### Julie Fadlon, Julia Horvath, Tal Siloni & Ken Wexler

(6) Uniqueness

An idiom is unique to a given diathesis α, if α does not share the idiom with its (existing) root-counterpart in diathesis β, which α would most directly be related to by derivation. Specifically,


Lists of 60 predicates of each diathesis were composed based on the lists of predicates used by Horvath & Siloni (2009). In the Hebrew version of the survey, predicates were sampled quasi-randomly: The sample for each diathesis consisted of the first 60 verbs in a verb dictionary that had the relevant alternant (anticausative for transitive, and transitive for the other diatheses). In the current survey, we used the English translations of these predicates that did not violate the "alternant" criterion. Items that did violate it were replaced with suitable randomly chosen English verbs. For the full lists of predicate samples see Appendix A. For each diathesis, the number of predicates out of the sample of 60 giving rise to unique phrasal idioms was counted. This was done by searches of the idiom dictionaries, followed by Google searches to check occurrences of relevant root-mate idioms, and consultation of native speakers regarding the results.

The categorial nature of the passive form in idioms was determined by inserting it in contexts permitting only adjectives or only verbs, thereby diagnosing categorially ambiguous forms (see Wasow 1977). Specifically, we used the following diagnostics. First, adjectival but not verbal passives can occur as complements of predicates such as *seem*, *appear*, *sound*, *become*, and *remain*, which select an AP complement but not a VP one. Second, verbal (eventive) passives but not adjectival (stative) passives can occur in the progressive, be modified by adverbials of duration (such as *in a few minutes*), and be modified by rationale clauses. These diagnostics are illustrated in (7–8) below.

	- b. # The agreement was written in stone in a few hours/to make people respect it.
	- b. The beans were spilled in a few minutes/in order to attract attention.


### **2.2 Results**

As shown in Table 19.1, transitives, anticausatives and adjectival passives exhibited unique idioms, just like their Hebrew counterparts. Examples of unique anticausative (9), adjectival passive (10), and transitive (11) idioms are given below. Notice that the nonexistent version of the idiom would make a plausible idiom (in terms of its form, meaning and usability), that is, there is no principled reason why it does not exist. The full list of predicates and examples of unique idioms that they occur in are given in Appendix A.

> Table 19.1: Distribution of anticausative, adjectival passive and transitive in unique idioms


	- b. Transitive # burst something/someone at the seams
	- b. Transitive # feed someone to the gills
	- b. Anticausative # the bank broke

However, unlike in Hebrew, the verbal passive in English seems, prima facie, to present unique verbal passive idioms for two out of the 60 predicates, namely for *caught* and *bitten*. These idioms are given in (12). The combination *the x bug* is instantiated by versions such as *the travel bug* or *the acting bug*.

	- b. bitten by the *x* bug 'having the need/desire/obsession for something'

These phrasal idioms might at first be suspected to constitute unique verbal passive idioms: they are listed in idiom dictionaries in the passive form (and not in the active, in contrast to the norm of listing verb phrase idioms in dictionaries in the active form). Moreover, according to native speakers, these forms can be modified by adverbials of duration or appear in the progressive, suggesting that they have eventive, verbal occurrences.<sup>5</sup>

However, on closer examination, both of these turned out not to constitute true counterexamples to the generalization that there are no idioms unique to the verbal passive. Starting with (12a), the idiom *caught in the crossfire*, in fact, is attested – based on Google searches accompanied by native speakers' judgments – also in the transitive (active) form, as in (13), for instance.

(13) a. This *caught him in the crossfire* between radical proponents of independence and French opponents of anti-colonialism.

(Scheck 2014: 282)

b. … the Israeli–Palestinian conflict, which has often *caught them in the crossfire*. https://goo.gl/f2FbbG

As for the idiom in (12b), again, Google searches turn up a significant number of active transitive examples (e.g. 14–15).


All online examples in this paper accessed 28 January 2017.

<sup>5</sup>Below are two online examples of the idioms in (12a,b) in the verbal passive.

(i) … parents and staff were concerned that faith schools were being "caught in the crossfire" between Ofsted and the Department … https://goo.gl/jqwnDD

(ii) During one of these journeys, I was firmly bitten by the travel bug … https://goo.gl/ZR6BJj


The listing of (12a,b) in the passive participial form may well be due to the fact that in addition to occurring as a verbal (eventive) passive, they are also attested in the adjectival (stative) passive; the latter point is demonstrated by the fact that the idioms occur as complements of verbs selecting APs but not VPs, such as *seem* and *remain* (Wasow 1977), as illustrated by (16–17).


We can thus conclude that the idioms in (12) are not exceptions to the generalization that while there are idioms unique to the transitive, anticausative and adjectival passive diatheses, there are no idioms unique to the verbal passive.<sup>6</sup>

It is important to note that the fact that the transitive, anticausative and adjectival passive can head unique idioms does not mean that they only occur in unique idioms. In other words, while the verbal passive must share its idioms with its transitive alternant, the other diatheses need not, but can, share their idioms with their transitive alternant. Among the 60 predicates of each diathesis in our sample, 35 verbal passive predicates occur in idioms available also for the active, 17 predicates exhibit sharing of idioms between the transitive and anticausative, and 21 predicates show sharing across the transitive and adjectival passive.

### **2.3 Statistical analysis**

The results, including those of the verbal passive, are summarized in Table 19.2. To evaluate the significance of the relationship between the classification of a predicate as a verbal passive and its ability to head a unique idiom, we first performed two chi-square tests for independence, using a Bonferroni correction for multiple comparisons (×2): once on the entire data set (as in Table 19.2 and Figure 19.1) and once on a data set consisting of the transitives, anticausatives and adjectival passives (excluding the verbal passives, as in Table 19.1).

The results indicate that classification as a verbal passive indeed plays a central role in determining whether a predicate can head a unique idiom: while the test performed on the entire data set (i.e. including the verbal passives) found that there

<sup>6</sup>Additional idioms (headed by predicates not included in our sample) that may be suspected to be unique verbal passive idioms are discussed by Horvath & Siloni (2019), and are shown to also conform to the generalization that the verbal passive cannot head unique idioms.


Table 19.2: Distribution of anticausative, adjectival passive, transitive and verbal passive in unique idioms

is a significant relationship between diathesis and participation in unique idioms (χ² = 19.7, p < 0.001, corrected), the one performed on all the diatheses except for the verbal passive failed to find such a relationship (χ² = 6.7, p = 0.07, corrected).
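The logic of these two tests can be sketched in a few lines of Python, using only the standard library. The per-diathesis counts below are hypothetical placeholders (the chapter's actual counts appear in Tables 19.1–19.2); the Bonferroni correction is implemented by comparing each statistic against the critical value at the corrected level α = 0.05/2 = 0.025.

```python
# Sketch of the two chi-square tests for independence, with a Bonferroni
# correction (alpha = 0.05 / 2 tests = 0.025). The counts are HYPOTHETICAL:
# (unique, not unique) idiom-heading predicates out of 60 per diathesis.

def chi_square(table):
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

counts = [
    [18, 42],  # transitive (hypothetical)
    [12, 48],  # anticausative (hypothetical)
    [9, 51],   # adjectival passive (hypothetical)
    [0, 60],   # verbal passive: no unique idioms
]

# Critical values at the corrected alpha = 0.025: 9.35 (df = 3), 7.38 (df = 2)
print(f"all four diatheses: chi2 = {chi_square(counts):.2f} (df = 3)")
print(f"without verbal passive: chi2 = {chi_square(counts[:3]):.2f} (df = 2)")
```

With these placeholder counts, only the first statistic exceeds its critical value, mirroring the pattern reported above: the relationship between diathesis and uniqueness is significant only when the verbal passive is included.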

Figure 19.1: Distribution of diatheses in unique idioms

In sum, while the verbal passive cannot head unique phrasal idioms, the adjectival passive, anticausative and transitive can do so. Before turning to the account we advance for these findings, we first discuss an experiment we ran in order to further test this pattern of distribution, and establish the significance of its results.

## **3 Psycholinguistic evidence**

### **3.1 Prediction**

In order to further investigate the phenomenon of the lack of phrasal idioms unique to the verbal passive, we ran an experiment aiming to examine speakers' competence in the domain of idiom distribution. We adopted the experimental design put forth by Siloni et al. (2018), which tested competence based on invented idioms in order to circumvent speakers' probable acquaintance with idioms in their mother tongue. We composed idioms in English inspired by existing Hebrew idioms and taught them to native English speakers. After learning and assimilating the new idioms, participants were tested on their intuitions about the likelihood of idioms in the verbal passive, adjectival passive and anticausative to share their idiomatic meaning with their transitive alternant. Our prediction was as follows. If the findings discussed in the previous section indeed represent a linguistic pattern, then the experiment should show a significant difference between the verbal passive and the other diatheses regarding their likelihood, as perceived by native speakers, to head unique idioms.

### **3.2 Participants and method**

Participants included 36 native English speakers, 28 female and 8 male. 33 were monolingual; 3 were bilingual, with (self-reported) native or native-like knowledge of Bengali, Russian and Spanish. Their ages ranged between 18 and 32 (mean age 21.6). All participants had at least 13 years of education. None had linguistic education concerning the subject matter of this study. Participants were recruited in class or via recruitment ads and consisted of American students at MIT, Brown University and Wellesley College (MA). After participating in the experimental sessions, participants received a \$20 participation fee.

### **3.2.1 Stimuli**

We composed 12 English idioms inspired by existing Hebrew idioms: four headed by a verbal passive predicate, four headed by an adjectival passive, and four headed by an anticausative. All predicates had a transitive alternant, all idioms had a plausible transitive version (as judged by six speakers), and none had a similar existing idiom in English. Adjectival passive predicates were those formed by the suffix *-en*, which disallows a verbal reading (e.g. *shaven*). Verbal passive predicates were formed from dative verbs, which allow formation of passives that are unambiguously verbal.<sup>7</sup> The full list of invented idioms, including their Hebrew source of inspiration, interpretation, and example of usage, is given in Appendix B (Form 1).

<sup>7</sup>Levin & Rappaport (1986) put forward the "sole complement generalization" (SCG), which states that an adjectival passive of a dative verb is possible only if its formation involves externalization of the argument that is able to stand as the sole realized complement of the verb. Externalization refers to the mechanism turning an internal argument of the input verb into the argument that the adjectival passive modifies or is predicated of. Thus, for instance, the Theme of the verb *sell* can be its sole complement (i) and therefore the adjectival passive in (ii)

### **3.2.2 Design**

The experiment proceeded in two sessions. In the first, the idioms were taught on the basis of a list including their respective interpretations and examples of usage (henceforth: the teaching session). In the second session, the previously taught idioms were reviewed, and participants were asked to complete three questionnaires: first, a multiple-choice comprehension questionnaire; second, a completion questionnaire; and third, the target questionnaire, in which participants were asked to rate the likelihood that the transitive version of the idioms exists (henceforth: the practice and testing session). For instance, the (invented) anticausative idiom *drown in the trash can* (18a) was associated with the interpretation in (18b) and the usage example in (19). The comprehension task, the completion task and the experimental task for this idiom are given in (20), (21), and (22), respectively. The comprehension task required choosing the correct response out of three options: a literal interpretation (option 1 in (20)), the idiomatic interpretation we associated with the idiom (option 2 in (20)), and a wrong (but contextually plausible) idiomatic interpretation (option 3 in (20)).

For all the idioms used in the experiment see Appendix B (Form 1).

(18) a. drown in the trash can

b. Interpretation: 'get fooled'

(19) Usage example

Alice really enjoys playing practical jokes on her friends and family. They are all pretty gullible, but her favorite victim is her little sister who somehow manages to drown in the trash can each and every time Alice tries to set her up.


is possible. In contrast, the Goal cannot be the sole complement of *sell* (iii); hence, the adjectival passive in (iv) is ruled out. It then follows that any expression of the form in (v) (where adjectival passive formation would violate the SCG) can only be verbal.

(v) … was sold something


### (20) Comprehension

A: Why are you so angry?

B: It's my annoying sister with her practical jokes. I feel so stupid. This time she told me she got married in Vegas, and I was stupid enough to drown in the trash can and call every member of our family.

In the dialogue above, when B says "drown in the trash can", she means that:

1. She got so excited she walked right into a bin full of garbage

2. She got fooled

3. She got upset

(21) Completion

Complete the following: She drowned in the

(22) Experimental task

You have learned the idiom 'drown in the trash can'. How likely (from 1–5) does it seem to you that the following idiom exists as well?

'drown someone in the trash can'

### **3.2.3 Procedure**

As mentioned above, the experiment included two sessions. In the teaching session, the instructor first explained to the group of participants that they were about to learn invented idioms on which they would be asked questions in a following meeting. The instructor then distributed the list of idioms, interpretations and usage examples (Form 1, see Appendix B), and taught the idioms by reading each idiom aloud in various tenses (to ensure participants would not assume its tense was fixed), along with its meaning and example of usage. Participants were then asked to go over the idioms again before the second meeting. The practice and testing session took place three days later. The instructor made sure each participant had a copy of Form 1 and slowly read its contents aloud. Participants were then asked to return the form and were given the comprehension questionnaire (Form 2, Appendix B), the completion questionnaire (Form 3, Appendix B), and the experimental questionnaire, on which participants rated the likelihood that an idiom's transitive version exists (Form 4, Appendix B), printed side down. Participants were instructed to first fill in Forms 2 and 3, and only then proceed to Form 4 (the experimental questionnaire). The instructor made sure this process was indeed executed in the specified order and that participants did not look at a previous form once they had continued on to the next one.


Data from one participant, who had a total of three errors in the two practice forms, were discarded, given that the task required assimilation of the idioms. Data from six additional participants, who had one error each, were included, on the assumption that a single error does not cast doubt on the speaker's knowledge of the learned idioms.

### **3.3 Results**

Figure 19.2 shows the mean acceptance ratings of the transitive counterpart per diathesis.

Figure 19.2: Mean acceptance ratings of transitive counterpart by diathesis (error bars represent standard deviation)

We used the lmerTest package in R (Kuznetsova et al. 2015) to fit a mixed effects model to our data, with ratings of the transitive version's likelihood as the dependent variable, diathesis of taught idiom as the fixed factor, and participants and items as random effects.

Following Barr et al. (2013), we started out by running a maximal model including subject and item random intercepts and a random slope for the fixed effect. Due to convergence failure, the random slope for items (but not for subjects) was removed. This model yielded a significant effect of diathesis<sup>8</sup> (F(2, 14.9) = 13.77, p < 0.001).

As shown in Table 19.3, planned pairwise comparisons with an application of a Bonferroni correction for multiple comparisons (×3) revealed that ratings for transitive counterparts of idioms headed by a verbal passive (M = 4.33, SD = 0.82) were significantly higher than ratings of transitive counterparts of idioms headed by anticausatives (M = 3.19, SD = 1.05), and that ratings for transitive counterparts of idioms headed by anticausatives were significantly higher than those provided for the transitive counterparts of idioms headed by an adjectival passive (M = 2.44, SD = 1.19).

<sup>8</sup>Test statistics were obtained by applying the functions *anova* (for F and p values evaluating the role of the fixed factor as a predictor) and *difflsmeans* (for estimates, labeled as *β*, standard errors, and t and p values evaluating the difference between conditions).
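The Bonferroni correction used for these planned comparisons simply multiplies each raw p-value by the number of comparisons (here ×3), capping the result at 1. A minimal sketch, with hypothetical raw p-values standing in for the model estimates reported in Table 19.3:

```python
# Bonferroni correction for three planned pairwise comparisons.
# The raw p-values below are HYPOTHETICAL placeholders, not the actual
# estimates of Table 19.3.

def bonferroni(raw_p):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    k = len(raw_p)
    return {pair: min(p * k, 1.0) for pair, p in raw_p.items()}

raw_p = {
    "verbal passive vs. anticausative": 0.0004,
    "verbal passive vs. adjectival passive": 0.0001,
    "anticausative vs. adjectival passive": 0.012,
}

for pair, p in bonferroni(raw_p).items():
    verdict = "significant" if p < 0.05 else "n.s."
    print(f"{pair}: corrected p = {p:.4f} ({verdict})")
```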


Table 19.3: Planned pairwise comparisons between diathesis of taught idiom

In sum, the transitive counterparts of verbal passive idioms were rated as significantly more likely to exist than those headed by anticausatives and adjectival passives. In addition, unlike in the Hebrew experiment, the transitive alternants of anticausative idioms were rated as significantly more likely to exist than those of idioms headed by adjectival passives.

## **4 Discussion**

### **4.1 Support for Horvath & Siloni's approach**

The survey of English idiom dictionaries has shown that phrasal idioms distribute differently in the verbal passive diathesis versus the transitive, anticausative and adjectival passive diatheses: while they cannot be unique to the verbal passive, they can be unique to the latter diatheses. The experiment we conducted further supports this pattern of distribution: participants perceived the likelihood of the verbal passive to share idiomatic meanings with its transitive counterpart as significantly higher than that of both the adjectival passive and the anticausative. That is, the likelihood of the verbal passive to head unique idioms is significantly lower than that of the two other diatheses. Both the survey and the experiment reveal that the distribution of idioms is sensitive to the diathesis of their respective head. This sensitivity reinforces the claim that idioms are stored as linguistic knowledge (i.e., intra-grammatically), since otherwise there would be no reason for them to be affected by a grammatical factor such as the diathesis of their head.<sup>9</sup>

Further, the existence of unique idioms in the transitive, anticausative, and adjectival passive shows that idioms are not stored in the lexicon under the root of their head, i.e., as subentries of the bare root. If they were, we would erroneously predict prevalent idiom sharing across diatheses, that is, idiomatic meanings would have to be shared by the various diatheses generated by the same root (under which idioms would be stored), except for gaps due to independent reasons. Likewise, if phrasal idioms were stored under the root of their head, all other things being equal, we would erroneously expect the anticausative, adjectival passive, and verbal passive to be conceived by speakers as equally likely to share idiomatic meanings with their transitive alternants. (Recall that the transitive alternants of the experimental items were judged as potentially possible idioms.)

Following Horvath & Siloni (2009; 2019), we derive the distributional distinction between the verbal passive and the other diatheses, as revealed by both the survey and the experiment, from the distinct storage technique available to them. Consider first the options mentioned in the literature as to how idioms are stored. On the one hand, it has been suggested that idioms are stored as independent ("big") lexical entries (e.g., in psycholinguistic studies by Bobrow & Bell 1973; Swinney & Cutler 1979; Gibbs 1980). On the other hand, the idea that idioms are stored as subentries of other existing entries has also been entertained: it has been suggested that they are stored by multiple storage, that is, as subentries of the lexical entries of each of their constituents (Everaert 2010), or as subentries of the Encyclopedic entries of their constituents (Harley & Noyer 2003). It has also been proposed that they are stored under the lexical entry of the head of the idiom only (Baltin 1989; Horvath & Siloni 2009).<sup>10</sup>

<sup>9</sup>As observed by Siloni et al. (2018), one could suggest that the learnt idioms in the experiment were not stored in the participants' lexicon, as would be assumed under circumstances of natural learning, but rather placed in some short term storage, given that the exposure and learning take place in an experimental setting. A priori, this could be a possible alternative hypothesis. However, this short term storage would by assumption be outside the grammar's storage component. Thus, adopting this hypothesis would leave us with the question of why the results of the experiment turn out to pattern the way they do and specifically why they manifest sensitivity to diathesis. The short-term storage hypothesis in itself does not offer any account of the pattern of behavior revealed in the experiment. Moreover, the fact that the findings reproduced the pattern revealed by the survey of (existing) idioms would then be a coincidence.

<sup>10</sup>Analyses of idioms claiming that the head of the idiom selects the other subparts of the idiom via a dependency between heads are common in earlier literature, and include a variety of otherwise different approaches, such as Bresnan (1982), Erbach (1992), Koopman & Sportiche


Horvath & Siloni (2009; 2019) observe that the sensitivity of Hebrew phrasal idioms to the diathesis of their head can be accounted for under the assumption that they are stored as subentries of their head.<sup>11</sup> If they were stored as independent entries of their own, there would be no reason why their ability (in case of existing idioms) and likelihood (in case of invented idioms) to exist as unique idioms should depend on the specific diathesis of their head. They could be stored as entries independently of whether their head is a verbal passive or a predicate in some other diathesis. On the other hand, under the assumption that they are stored as subentries, there must exist a lexical entry whose subentries they can be. Thus, if their head turns out not to be a lexical entry, evidently they cannot be stored as its subentries, as further explained directly.

In the linguistic literature, it is a widely held view that the verbal passive is not formed in the lexical component, but is derived by the computational system post-lexically (Baker et al. 1989; Collins 2005; Horvath & Siloni 2008; Meltzer-Asscher 2012, among others). Being a post-lexical output, the verbal passive is, reasonably, not stored in the lexical component. It follows that the verbal passive cannot have subentries. Hence, an idiom in the verbal passive cannot be stored directly under the entry of its head, and thus cannot be unique to the diathesis. A verbal passive idiom must share its idiomatic meaning with its transitive version, which is stored under the transitive entry. Post-lexical passivization of the transitive idiom is what produces the verbal passive version.<sup>12</sup>

Under Horvath & Siloni's (2009; 2019) approach, the transitive, unaccusative (in our present terminology, anticausative), and adjectival passive, unlike the verbal passive, are entries in the lexicon. It then follows that an idiom may be

<sup>(1991),</sup> and O'Grady (1998). These studies do not address the manner of storage, and are thus not explicit as to whether they propose that idioms are listed exclusively as subentries of the entry of their head. Yet based on their emphasis on head-on-head dependency and the parallels between their accounts of idioms and other instances of "selection"/"subcategorization" (in the terminology of Chomsky's (1965) Aspects model), it is reasonable to assume that these proposals implicitly adopt a head-based storage for idioms as well.

<sup>11</sup>The question as to whether there are good reasons to believe that they are stored in addition under the heads of the other constituents in the idiom is orthogonal to our discussion and will not be examined here.

<sup>12</sup>We, correctly, do not predict the automatic existence of a verbal passive version for every transitive idiom. Since verbal passives are derived post-lexically, the question of whether or not a transitive idiom will exist in the verbal passive depends on whether the idiom is able to undergo passivization resulting in a well-formed output. This in turn involves interpretive factors, such as whether the idiom chunk to become the derived subject of the passivized idiom has the appropriate properties, e.g., referentiality, to be compatible with the information structure consequences of being in subject position (for discussion, see Ruwet 1991). See also Nunberg et al. (1994) for discussion of the question of which idioms in the active can be passivized.


stored under each of them, thereby enabling the existence of idioms unique to the diathesis. Whether or not these diatheses are indeed lexical entries is debated. But there are studies advocating this claim on independent grounds. Horvath & Siloni (2011) and Reinhart (2003) argue that transitives are lexical entries, which can be the input for additional diathesis derivations. Further, there are studies claiming that unaccusatives and adjectival passives are derived by a lexical operation (see Chierchia 2004; Horvath & Siloni 2011; Levin & Rappaport Hovav 1995; Koontz-Garboden 2009; Reinhart 2003, for unaccusatives, and Horvath & Siloni 2008; Levin & Rappaport 1986; Meltzer-Asscher 2011, for adjectival passives). Assuming with Horvath & Siloni (2008; 2011) that these lexical outputs are stored in the lexicon (more generally, that the adjectival passive and anticausative are lexical entries), it becomes clear why they can have unique idioms. Being lexical entries, they can have their own subentries including idioms unique to them.

If indeed these diatheses are derived by lexical operations, it means that the lexicon must be an active component of grammar (as argued by Reinhart 2003, Siloni 2003 among others), and not a mere storehouse of atomic items, in contrast with the view of syntacticocentric approaches (Borer 2005; Marantz 1997; Pylkkänen 2008; Ramchand 2008, among many others).

Recall, in addition, that the transitive alternants of anticausative idioms were rated in the English experiment as significantly more likely to exist than the transitive alternants of those headed by adjectival passives. Obviously, the transitive and adjectival passive differ in category (verbal vs. adjectival, respectively), unlike the transitive and anticausative, which are both verbal. One could then suggest that when sharing across diatheses involves a change in lexical category, participants perceived it as less likely. However, the findings regarding Hebrew cast doubt on this suggestion, as such a distinction was not found in the Hebrew experiment, where the adjectival passive and the anticausative showed no significant difference regarding their likelihood to share a transitive alternant. Moreover, no such significant effect between the anticausative and adjectival passive was found in the English survey either. In fact, the number of adjectival passives heading shared idioms in English (21/60) is even a bit larger than the number of anticausatives heading shared idioms (17/60). That is, there does not seem to be a systematic distinction between the ability and likelihood of the adjectival passive to share idioms with the transitive and those of the anticausative. We thus leave this issue open for further experimentation.

In sum, assuming that (i) the verbal passive is not stored in the lexicon (since it is formed post-lexically), unlike the other diatheses, and (ii) phrasal idioms are stored as subentries of their head, the distinct distribution of phrasal idioms across diatheses follows. The verbal passive, unlike the other diatheses, cannot head unique idioms, as such idioms would be unable to be stored, given that the verbal passive is not a lexical entry.

This approach indeed accounts for the findings. But other approaches seem a priori plausible too. We examine them in the next section.

## **5 Alternative approaches**

As observed by Siloni et al. (2018), one might try to suggest that the pattern revealed by the survey and experiment could be a reflection of the productivity found at the diathesis level, that is, that the results reflect inheritance from the verb level to the idiom level. More specifically, in the same way that there are no verbal passives that lack a corresponding transitive alternant, there are also no verbal passive idioms that lack a transitive counterpart. And in the same way that there are sporadic gaps in the unaccusative (anticausative) alternation – certain unaccusative verbs idiosyncratically lack a transitive counterpart in a given language, and vice versa – anticausative and transitive idioms can similarly lack the relevant alternant. Indeed, in the case of the verbal passive, an approach of inheritance from the verb to the idiom level can be envisioned. However, as far as the anticausative alternation is concerned, this approach is implausible. While uniqueness at the idiom level is a pervasive phenomenon (as shown in Table 19.2 and exemplified in (1–3) and (9–11) above), unaccusative verbs systematically have a transitive counterpart (with a Cause external role) and vice versa, except for isolated sporadic gaps (Haspelmath 1993; Reinhart 2003). Idiom distribution across diatheses, therefore, cannot be considered a reflection of productivity at the verb level.

### Julie Fadlon, Julia Horvath, Tal Siloni & Ken Wexler

A different potential account of the experimental findings could rely on the difference in valence between verbal passives, which have two arguments available (including the implicit external argument), and anticausatives and adjectival passives, which are one-place predicates (but see footnote 13). The contrasting findings of the experiment may then follow, so the "valence" argument would go, from the fact that when participants are asked to estimate the likelihood of the active transitive version based on a verbal passive idiom, they are dealing with predicates of the same valence (both two-place), while when relating an anticausative or an adjectival passive idiom to a potential transitive version, they need to convert a one-place predicate into a two-place, transitive version of the same idiom. The addition of an argument, necessary in the anticausative and adjectival passive cases but not in the verbal passive, may add some extra difficulty to the task, and might thus be claimed to be the source of the difference in the results found between these diatheses in the experiment. But such an approach would not explain the total lack of unique verbal passive idioms found in the surveys of existing idioms in both English (§2) and Hebrew (Horvath & Siloni 2009). Even if it were easier to "transitivize" a verbal passive idiom, this still would not explain why the latter always has a transitive alternant.<sup>13</sup>

Within the framework of Distributed Morphology, which has a single structure-building engine, the syntax, Marantz (1997) suggests that the syntactic head introducing the external argument (Agent) is the boundary delimiting the domain of special (idiomatic) meanings. He thus argues that the fact that the verbal passive involves an external argument (explicit or implicit) is the reason why it cannot be associated with special meanings (that is, head unique idioms). The unaccusative, in contrast, lacks an external argument and can head unique idioms. It is, however, not obvious how this line of reasoning can account for the fact that transitive verbs can head unique idioms, although they involve an external argument (see also footnote 13).

The syntactic boundary delimiting the domain for idiosyncratic meanings could then be argued to be higher than the head responsible for the Agent, say the head responsible for the formation of verbal passives. The verbal passive would not give rise to unique idioms because it would be beyond the syntactic domain of special meanings. Such a proposal would be at odds with Arad's (2005) arguments that the domain of idiosyncrasy is the local domain of the root, and this is the domain delimited by the first category-assigning head above the root; the domain of any higher head is argued by Arad to have no access to meanings associated with the root. But if so, then extending the locality domain, trying to cover the split behavior of idioms in the verbal passive versus the other diatheses we examined, seems ad hoc.<sup>14</sup>

<sup>13</sup>In addition, although in the past, it has been assumed that adjectival passives do not involve an implicit external argument, it was shown in recent literature that a subset of the set of adjectival passives does involve an external argument (Anagnostopoulou 2003; Gehrke & Marco 2014; McIntyre 2013; Meltzer-Asscher 2011). In the Hebrew experiment, two adjectival passive idioms are reported to be headed by an adjectival passive involving an external role (*aruz* 'packed', *tafur* 'sewed'). These idioms did not score better than the other two adjectival passive idioms (Siloni et al. 2018). In the English experiment the adjectival passive *shaven* implicates an external role. It did turn out to score better. Thus, no conclusion can be drawn at this point. <sup>14</sup>Under some recent approaches dissociating Voice from *v* (e.g., Harley 2013), the passive Voice head merely captures the absence of the syntactic external argument (failure to merge a DP as its specifier) and is argued not to involve any of the particular semantics of the various "flavors" of *v* (assumed by syntacticocentric approaches). Given such a proposal, one might perhaps think of attributing the absence of unique verbal passive idioms to the Voice head's lack of semantic substance. But the postulation of such a Voice head is not worked out in sufficient detail to permit evaluation of its merits, and its potential ability to account for the idiom data.

## **6 Conclusion**

We have reported and discussed two novel empirical studies, one quantitative survey of idiom dictionaries and one experimental study, examining the patterns of distribution exhibited by phrasal idioms across three diathesis alternations in English. The studies aimed to assess, based on evidence from two distinct sources, what the cross-diathesis distribution of idioms can tell us about idiom storage, about the representation and derivation of these diatheses in the grammar, and consequently about the division of labor between the lexicon and the syntax.

Our investigation dealt with the question of whether there is an asymmetry in the pattern of idiom distribution among the various diatheses, as reported in the literature. Specifically, we investigated whether phrasal idioms in the verbal passive always have a transitive version, that is, cannot be unique to the verbal passive, while the anticausative, adjectival passive and transitive diatheses commonly exhibit idioms specific to the diathesis. The results of our English survey confirmed that the latter diatheses exhibit unique idioms, while the verbal passive always shares its idiomatic meaning with its transitive alternant. The survey's findings thus suggest that the distribution of phrasal idioms depends on the particular diathesis of their head. To further confirm these findings, which were based on the set of existing idioms, we also conducted an experimental study, which tested native speakers' perception of the likelihood that invented phrasal idioms in the verbal passive, the adjectival passive and the anticausative diatheses share their idiomatic meaning with their transitive alternant. The experimental results reproduced the same pattern of asymmetric distribution as found in our idiom survey. Speakers judged the likelihood of the verbal passive sharing idiomatic meanings with its transitive counterpart as significantly higher than the likelihood of the other two diatheses doing so.

The converging findings of these two different studies of the pattern of idiom distribution were argued to follow from the particular storage technique available to phrasal idioms. Specifically, it was suggested that phrasal idioms are stored in the lexicon as subentries of the entry of their head (not as independent entries of their own). This proposal straightforwardly accounts for the lack of unique phrasal idioms in the verbal passive: Since the verbal passive, unlike the other diatheses we examined, is a post-lexical output, which does not have its own entry in the lexicon, it obviously cannot have subentries. Hence, an idiom in the verbal passive cannot be stored directly under its head, and thus cannot be unique to the diathesis. The transitive, anticausative and adjectival passive, in contrast, are entries in the lexicon, and can therefore list unique idioms as their subentries. Our findings provide evidence that the lexicon comprises derived diatheses as lexical entries, rather than roots only. We take this as an indication that the lexicon is an active component of grammar where derivational operations can apply.

## **Acknowledgements**

This research was supported by Grant No. 2009269 from the United States–Israel Binational Science Foundation (BSF). We would like to thank Kayla Gold-Shalev for sampling the verbs and collecting the idioms and Uriel Priva-Cohen for his help with subject recruitment. Thanks also to two anonymous reviewers. Finally, we are grateful to the editors for the opportunity to contribute to this volume in honor of Ian Roberts.

# **Appendices**

In both the survey and the experiment, we took into account the notion of decomposability first defined by Nunberg et al. (1994). In their study of idioms, Nunberg et al. distinguish between "idiomatically combining expressions", in our terms, decomposable idioms, and "idiomatic expressions", in our terms nondecomposable idioms. An idiom is considered decomposable if its structure is isomorphic with its meaning, in the sense that its constituents correspond to elements of its meaning. If not, it is nondecomposable. Nunberg et al. (1994) claim that decomposability is a prerequisite for "syntactic flexibility", such as the ability of subparts of an idiom to undergo movements. If decomposability affects flexibility (movement), it seems relevant for the shift between diatheses.

However, it must first be noted that the claim that decomposability is a prerequisite for the syntactic flexibility of idioms is in fact rather controversial. Counterexamples are frequently attested; consider, for instance, the following nondecomposable idioms, which nonetheless exhibit diathesis alternations: *break someone's heart* 'sadden, disappoint someone' and *open the door to something* 'enable, allow something to happen', which both have an unaccusative counterpart, and *keep tabs* 'observe, follow', which has a verbal and an adjectival passive counterpart.

Nonetheless, to be on the safe side, we tried to take this claim into consideration and chose the idioms for both the survey and the experiment so as to avoid this potentially interfering factor. Our guidelines were as follows. In cases where the lexically fixed subparts would be involved in the relevant diathesis-changing operation, we included idioms only if their meaning (interpretation) could be mapped onto their head and its arguments (or modifiers). We did not consider metaphoric paraphrases as interpretations appropriate to determine decomposability. In addition, we did not require matching between parts of the meaning and elements of the internal structure of arguments (or modifiers). For instance, the idiom *burn one's boats/bridges* (item 1, list of unique transitive idioms) as well as *one's red bulb lit up* (item 5, Form 1 of the experiment, Appendix B) are considered decomposable (as schematized here: [<sup>1</sup> *burn*] [<sup>2</sup> *one's boats/bridges*] '[<sup>1</sup> destroy] [<sup>2</sup> options of reversing the situation]'; [<sup>1</sup> *one's red bulb*] [<sup>2</sup> *lit up*] '[<sup>1</sup> one's suspicion] [<sup>2</sup> arose]'). Nondecomposable idioms were freely used when the potential diathesis shift operation would not involve a lexically fixed constituent of the idiom, as for example in *glued to one's seat*/#*glue one to one's seat* (item 6, list of unique adjectival passive idioms, Appendix A).

## **Appendix A Survey**


Table 19.4: Sampled predicates




Table 19.5: Unique unaccusative idioms



Table 19.6: Unique transitive idioms


Table 19.7: Unique adjectival passive idioms

# **Appendix B Forms**

### **B.1 Form 1**

Idioms, interpretations and usage examples (the copies distributed to participants did not include indications of diatheses)





### **B.2 Form 2**

### **Comprehension questions**

- *Participant no.:*
- *Age:*
- *Gender:*
- *Native English speaker:* yes/no
- *Are there any other languages you speak? If yes, name these languages and estimate your level of knowledge (poor/good/excellent/native-like):*

### **Please read the following dialogue between A and B and circle the correct answer.**

	- B: I think she's cheating on me. My red bulb lit up when she started wearing perfume to work.

In the dialogue above, when B says "my red bulb lit up", he means that:

	- B: I'm not sure. He works hard but I get really distracted sitting across from him, he is stricken with tremors all the time!

In the dialogue above, when B says he "is stricken with tremors", he means that:

	- B: Well, every evening after dinner he supposedly goes up to his room to study, but when I looked at his report card yesterday I realized I had been sold a stew of lentils.

In the dialogue above, when B says she "was sold a stew of lentils", she means that:

	- B: You leave me no choice. You always say you arrive on time but on five different occasions customers called to complain that the store was closed for at least 45 minutes after opening time. I've been handed soggy bread again and again and I won't have it!

In the dialogue above, when B says "I've been handed soggy bread", she means that:

	- B: It went pretty smoothly until my granddad broke out of his harness when my brother used the wrong fork.

In the dialogue above, when B says her granddad "broke out of his harness", she means that:

	- B: Don't ask, they found out about each other and now I am shaven on both sides of my face.

In the dialogue above, when B says "I am shaven on both sides of my face", she means that:

	- B: I don't know the details, but I heard Mary found out she was slipped the smudge by Jennifer and decided never to speak to her again.

In the dialogue above, when B says Mary "was slipped the smudge", he means that:

	- B: No. I promised myself never to bake with you again, after that time you forgot to take the muffins out of the oven and I was given the heavy beam. You never do what you're supposed to!

In the dialogue above, when B says he "was given the heavy beam", he means that:

	- B: I think Maggie will give it to Steve, her former assistant. He has warmed up in her light for many years, which makes him the person she trusts the most.

In the dialogue above, when B says "he has warmed up in her light", he means:

	- B: It's great—at a very central location and only a block away from my work. But the downside is that since it's so close to everything, I get no exercise at all. So I guess I'm sunken in a pit of lard.

In the dialogue above, when B says she is "sunken in a pit of lard", she means that:

	- B: Don't even think about it. He sure looks like a decent guy but my friend dated him for a while so I heard a lot about him. He really seems stained under his skin.

In the dialogue above, when B says he really seems "stained under his skin", she means he really seems:

	- B: It's my annoying sister with her practical jokes. This time she told me she got married in Vegas, and I was stupid enough to drown in the trash can and call every member of our family.

In the dialogue above, when B says "drown in the trash can", she means that:


### **B.3 Form 3**

### **Completion task**


### **B.4 Form 4**

### **Experimental task**

Participant no:

Please answer the following questions:

1. You have learned the idiom 'break out of his/her harness'. How likely (from 1–5) does it seem to you that the following idiom exists as well?

'break someone out of his/her harness'




How likely (from 1–5) does it seem to you that the following idiom exists as well?

'give someone the heavy beam'


# **Chapter 20**

# **Rethinking the syntax of nominal predication**

# David Adger

Queen Mary University of London

Human languages often disallow bare nominals as predicates. Scottish Gaelic is a particularly striking case, in that it disallows simple nominal predication entirely, using alternative syntactic means to deliver the required meanings. This paper provides an answer both to the larger question of why NP predication is so restricted, and to the more local one of why Gaelic uses the particular syntactic forms it does, based on a principle that regulates the interface between syntax and semantics: syntactic predicates must have open eventuality variables.

# **1 Introduction**

Scottish Gaelic, like Irish, does not allow simple noun phrase predication, of the type one sees in English.

	- b. Anson is a teacher.

This paper finds the reason for this at the interface between syntax and semantics. I propose a general principle regulating predication as follows:

(2) For an XP to act as a syntactic predicate, it must have a semantically open eventuality variable.

I combine this with the proposal, motivated in Adger (2013), that underived nouns are sortal (one place) semantic predicates of individuals, and so never involve an eventuality variable. It follows that an NP can never act as a syntactic predicate.

David Adger. 2020. Rethinking the syntax of nominal predication. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 461–496. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972872


Languages, however, need to express nominal predication, so they get around the strictures imposed by these principles in various ways. I show how Scottish Gaelic uses two distinct strategies for this purpose. The overall conclusion is that universal restrictions at the syntax–semantics interface nevertheless leave languages open to a range of syntactic solutions to express thought, leading to restricted variability in how predication is syntactically expressed.

## **2 The basic set of puzzles**

Languages often go out of their way to do something strange when they use projections of nominals as predicates. For example, Scottish Gaelic (like related Celtic languages) allows simple [DP predicate] orders after the finite auxiliary when the predicate is an adjective or a prepositional phrase (Chung & McCloskey 1987):


However, as noted by Adger & Ramchand (2003), the predicate cannot be a nominal:

(5) Scottish Gaelic: \* *Tha Calum oileanach.* (gloss: be.prs Calum student) intended: 'Calum is a student.'

There are two ways of expressing the English translation in (5) (Cram 1983; Schreiner 2015). In the first, the auxiliary subject predicate structure is maintained, but an apparently prepositional element appears before the nominal (I'll term this the *p-strategy*):

(6) Scottish Gaelic: *Tha Calum na oileanach.* (gloss: be.prs Calum in.poss.3sg.m student) 'Calum is a student.'


The alternative is to use a clefting structure (the *cleft-strategy*):<sup>1</sup>

(7) Scottish Gaelic: *'S e oileanach a th' ann an Calum.* (gloss: cop it student rel be.prs in Calum) 'Calum is a student.'

In both strategies, the preposition *ann an* 'in' appears.<sup>2</sup> In the p-strategy, *ann an* inflects as though it were followed by a possessive clitic, taking exactly the same morphological forms that it would in a true nominal:


The second mark of this strategy is that the subject precedes the inflected *ann an*:

(10) Scottish Gaelic: *Tha Calum na oileanach.* (gloss: be.prs Calum in.poss.3sg.m student) 'Calum is a student.'

<sup>1</sup>There is, in formal/archaic registers, a third possibility, where a bare copula is used (what Adger & Ramchand 2003 call the *inverted copular construction*, ICC), as in (i). However, for simple nominal predication at least, this is vanishingly rare in normal discourse:

(i) Scottish Gaelic (archaic): *Is cat Lilly.* (gloss: cop cat Lilly) 'Lilly is a cat.'

<sup>2</sup>A word on the morphology of this preposition to avoid confusion in interpreting the glosses. The bare form of the preposition used before indefinite NPs and proper names is written as two words *ann an*, pronounced [aʋnən], but before definites or (for some speakers) universals it is *anns*, [aʋns]. It has agreeing forms, e.g. *annam*, 'in me', *innte*, 'in her' and, confusingly, *ann*, 'in him', and it also has special forms it takes before possessive clitics, e.g. *nam*, 'in my', *na*, 'in his/in her' (depending on whether the following noun is lenited (masculine) or not (feminine)), *nar*, 'in our', etc.


(11) Scottish Gaelic: \* *Tha oileanach na Calum.* (gloss: be.prs student in.poss.3sg.m Calum) intended: 'Calum is a student.'

In the cleft-strategy, in contrast, the preposition appears in its "bare" form, and the apparent subject follows it. The morphology of the preposition here is just what would be expected for prepositions with full DP complements. This observation is further confirmed by the fact that when the subject is a definite DP (that is, when it is headed by the definite article and certain other determiners), the preposition inflects for definiteness:

(12) Scottish Gaelic: *'S e oileanach a th' anns a' bhalach.* (gloss: cop it student rel be.prs in.def the boy) 'The boy is a student.'

Contrary to what we saw with the p-strategy, here the apparent subject follows the preposition and the predicate precedes it. Compare (12) with (13):

(13) Scottish Gaelic: \* *'S e am balach a th' ann an oileanach.* (gloss: cop it the boy rel be.prs in student) intended: 'The boy is a student.'

These two strategies might be thought of simply as different syntactic options built on the same core structure, with a prepositional element taking a small clause complement, followed by either subject raising, or A-bar extraction of the predicate:

(14) a. [TP SubjectDP in [SC 〈DP〉 PredicateNP]]
b. PredicateNP [CP [TP in [SC SubjectDP 〈NP〉]]]

We can call this the *unified small clause* analysis (USC). The USC has two immediate advantages, one analytical and one theoretical: analytically, it straightforwardly captures the odd "flip" of the preposition/subject order, while theoretically it allows one to maintain the idea that the basic "thematic" relation of predication is captured in the same way, with the apparent differences due to surface syntactic effects. This kind of approach, preserving the uniformity of thematic assignment hypothesis (the UTAH of Baker 1988), is familiar from transformational analyses of passive, raising, etc.

A further advantage is that it allows one to say that there is nothing special about NP predication in Gaelic (beyond, perhaps, some statement that a small clause with an NP predicate must have at least one of its constituents "evacuated"). That is, NP predication reduces to the same underlying structure as adjectival and prepositional predication.

However, I'm going to argue against this position and for an analysis that treats these two strategies as derivationally unrelated. I'll argue on the grounds of interpretational differences between the two strategies that the p-strategy involves co-opting an aspectual functional category from the verbal domain to license the subject, while the cleft-strategy involves the syntax of property inclusion. In both cases the functional category that is spelled out as the (sometimes reduplicated) preposition *ann an*, 'in', is interpreted as a kind of inclusion: either an individual is included in a stative situation, or a property is included in a set of properties. However, these are fundamentally different relations, both syntactically and semantically. The coincidence in form is metaphorical, not theoretical. We can call this approach the syntax–semantics interface approach (SSIA).

The analytic problem of the preposition/subject order is solved in the SSIA by taking the two structures to be generated differently. On the theoretical level, this proposal actually pushes the syntax–semantics connection deeper than a UTAHstyle formulation: it connects the syntax, not just to the semantics of nominal predication, but rather to different fine-grained semantic types of predication.

I'll propose that the two different strategies are distinct solutions to a fundamental and uniform constraint on the syntax/semantics of nominals: they simply cannot have a syntactic subject (cf. Baker 2003). Adger (2013) proposes that when arguments are introduced as specifiers of a lexical category they can only be so introduced via an event variable (cf. Kratzer 1996). Only functional categories in the extended projection of verbs have this capacity, so nominals must take other routes to be associated with arguments. One route that Gaelic takes is to co-opt stative aspect from the verbal extended projection, and to use this stative functional category to introduce a subject. The other route is to use the syntax of property inclusion, so that the apparent subject is a higher-level predicate, an analysis motivated by the syntax of clefts in Gaelic in general (Adger 2011b).

I contrast this approach with that offered by Schreiner (2015). Schreiner argues for a uniformly nominal syntax for the p-strategy, building on the theory presented in Roy (2006), which takes nominals to be endowed with an event variable. This closes off a solution to the deeper problem about why the p-strategy exists in the first place, and why a simple nominal predication structure is impossible in Gaelic. I also argue that the syntactic empirical data favours an account of the p-strategy that takes it to have a distinct syntax from true nominals.

# **3 A unified small clause style analysis**

I first sketch out, and then dispense with, a unified syntactic analysis of the two constructions. In this analysis, the particle *ann an* can be taken to be an aspectual particle (Cram 1983), with the subject raising to some position just below the finite auxiliary, which I take to be in Fin (see Roberts 2005 for Welsh, Adger 2007 for Gaelic). I revisit the PredP status of the lowest constituent here directly:

The idea that the particle here is aspectual fits well with the functional inventory of the language, which marks perfect, progressive and prospective aspect via preposition-like elements that appear between the subject and the verb phrase:



(18) Scottish Gaelic: *Tha Calum gu òl.* (gloss: be.prs Calum asp drink.vn) 'Calum is about to drink.'

Furthermore, a small class of verbs, mainly verbs of position, have exactly the syntax of these predicate nominals: after the subject we find *ann an* inflected as though it were followed by a possessive clitic, further followed by the non-finite verbal form. It seems but a short step to take the preposition both in these verbs and in the predicate nominal construction to be marking a certain kind of stative aspect (this is essentially an updating of the analysis presented in Cram 1983 and adopted by Schreiner 2015):


The idea that the prepositional element in the p-strategy is aspectual seems well motivated.

The agreement on Asp (*ann an*) is obligatory and marks the φ-features of the subject, which would follow if we stipulate that Asp in this language bears agreement features and agrees with the subject. Under such an analysis, the possessive clitic is agreement triggered by movement of the subject, making it parallel to the Romance participial agreement systems discussed by Kayne (1993): agreement is obligatorily triggered when a DP moves through Asp's specifier.

(21) Scottish Gaelic: \* *Tha Calum ann an oileanach.* (gloss: be.prs Calum in student) 'Calum is a student.'

(22) Scottish Gaelic: *Tha Calum na oileanach.* (gloss: be.prs Calum in.poss.3sg.m student) 'Calum is a student.'



Why should the subject raise? We could either take this to be due to some property of T (a case or extended projection principle (EPP) related property as in Roberts & Roussou 2002), or we could assume, with Chomsky (2013), that the lowest level, where the predication takes place, is not well formed, as there is no head to provide a label. One might follow Chomsky and Moro (1997), dispensing with the Pred structure, and taking the categorial label PredP to be unneeded. Chomsky takes such {XP, YP} structures to be inherently unstable, forcing movement of one of the subconstituents.

Once the subject (*mi*) has raised to the specifier of TP, its trace is not counted for the calculation of labels, so the XP receives the same label as the nominal *oileanach* (N).

The cleft-strategy could be then taken to involve the same underlying structure, but with movement of the predicate NP as opposed to the subject, as follows:

We have seen that if the DP subject moves, we have the p-strategy. If the DP subject stays in situ, the predicate NP must move, on a Moro/Chomsky type analysis. That will derive movement of the predicate NP, but leaves open the question of why Asp does not agree in the cleft-strategy, and why the predicate A-bar extracts, rather than moves to the specifier of TP.

On the first of these, predicates in Gaelic do not, in general, enter into morphosyntactic agreement with their subjects, so we find different inflection on attributive vs. predicative adjectives, with only the former inflecting for agreement:

(27) Scottish Gaelic


Since predicates do not enter into agreement, Asp will not agree when the predicate is extracted across it, presumably because the nominal predicate does not, in fact, bear a full set of φ-features.

The noun does agree with its subject in number, as we can see in examples like the following:

(28) Scottish Gaelic: *Tha na caileagan nan oileanaich.* (gloss: be.prs the girls in.poss.3pl students) 'The girls are students.'


However, this agreement is semantic, not syntactic, as can be seen in the use of a singular predicate nominal with plural morphosyntactic agreement connected to honorificity. As in languages like French, the plural of the second person is used to mark respect, but the nominal in such cases shows number marking dependent on the plurality of the semantic referent (in this case singular).

(29) Scottish Gaelic: *Tha sibh nur oileanach.* (gloss: be.prs you in.poss.2pl student) 'You are a student.'

I return to the importance of the semantic interpretability of number on these nominals in adjudicating between different analytical approaches to this construction below.

We then need an extra stipulation to force further A-bar extraction into a cleft structure. We do not find predicate adjectives or prepositional phrases in subject position in Gaelic (that is, immediately following the finite auxiliary). If that generalisation is stated across the semantic category of predicate, rather than the syntactic category of nominal (so Gaelic would not allow the kind of inversion of predicate to subject, discussed by Moro (1997) or den Dikken (2006)), that would rule out the following example (I return to this example below – it is not as innocuous as it appears):

(30) Scottish Gaelic: \* *Tha oileanach ann an Calum.* (gloss: be.prs student in Calum) intended: 'Calum is a student.'

The predicate NP cannot move to the specifier of T: the predicate's φ-features are not sufficient to allow the kind of feature sharing that Chomsky's system requires for specifier licensing. In such a derivation TP would never be labelled.

We can however, allow the predicate to be directly A-bar extracted from its base position, giving the relative clause portion of (31) the structure in (32):

(31) Scottish Gaelic: *'S e oileanach a th' ann an Calum.* (gloss: cop it student rel be.prs in Calum) 'Calum is a student.'


The full cleft structure would then incorporate this relative clause as a subpart.

This analysis seems fairly well motivated, and it captures the apparently similar thematic relationship between the two alternative ways to express NP-predication. However, it turns out that there are consistent semantic differences between the two strategies, suggesting that the underlying configuration of the predication is different in the two cases, as opposed to just the surface structures. The syntactic analysis just sketched does not lead to the expectation of such differences, and so I propose an alternative.

## **4 A syntax/semantics interface analysis**

There are interesting semantic differences between the p-strategy and the cleftstrategy, which are not connected to the information structure/focus properties associated with clefts. The differences are somewhat subtle, but also familiar from NP-predicate constructions in other languages (see, for example, Roy 2006).

The first is the oddness of (33), compared to (34):

(33) Scottish Gaelic

?\* Tha Lilly na cat
be.prs Lilly in.poss.3sg.f cat
'Lilly is a cat.'

(34) Scottish Gaelic

'S e cat a th' ann an Lilly
cop it cat rel be.prs in Lilly
'Lilly is a cat.'

Roughly, the p-strategy is used when the assertion made by the predication is assumed to be non-permanent. (33) improves, for example, if we add an adjective


that restricts the predicate in a way that is sensible for a predicate which holds only temporarily (see also Schreiner 2015 for more detailed discussion and further examples):

(35) Scottish Gaelic

Tha Lilly dìreach na cat òg an drasta.
be.prs Lilly just in.poss.3sg.f cat young now
'Lilly is just a young cat now.'

This semantic restriction is why occupations (loosely construed) tend to be the only class of nouns used in the p-strategy in everyday discourse. NPs denoting occupations are easily understood as temporary properties of individuals:

(36) Scottish Gaelic

Tha mi nam òraidaiche.
be.prs I in.poss.1sg lecturer
'I am a lecturer.'

The effect is more striking when we use the two strategies to make claims about class inclusion. It is simply impossible to use the p-strategy to express such propositions:<sup>3</sup>

(37) Scottish Gaelic

\* Tha (an) iolaire na eun
be.prs (the) eagle in.poss.3sg.m bird
intended: 'The eagle is a bird/An eagle is a bird.'

(38) Scottish Gaelic

's e eun a th' anns an iolaire / a th' ann an iolaire
cop it bird rel be.prs in.def the eagle / rel be.prs in eagle
'The eagle is a bird/An eagle is a bird.'

<sup>3</sup> It is possible to use the ICC construction, as in (i) (see footnote 1). However, the cleft construction is much preferred in normal discourse:

(i) Scottish Gaelic

Is eun (an) iolaire.
cop bird (the) eagle
'The eagle is a bird/An eagle is a bird.'

I return to this in §6.

However, whereas the p-strategy is restricted in this way in its interpretation, the cleft-strategy is not. So it is perfectly well formed to use the cleft-strategy to express class inclusion, as well as predication involving occupations:

(39) Scottish Gaelic

's e òraidaiche a th' annam
cop it lecturer rel be.prs in.1sg
'I'm a lecturer.'

That this semantic difference at least partially tracks the syntactic difference between the two strategies suggests that it would be profitable to link the syntax and semantics tightly here. In contrast to the proposal sketched in the previous section, where the underlying structures for the two strategies are the same, with movement operations driving the surface differences, I suggest instead that there are two distinct ways of constructing nominal predication, correlating with the distinct interpretations that these structures have.

Take first the p-strategy:

(40) Scottish Gaelic

Tha Calum na oileanach.
be.prs Calum in.poss.3sg.m student
'Calum is a student.'

I propose that the p-strategy does indeed involve an aspectual particle, which combines with a stative category, denoted by functional structure containing the nominal. Schematically:

The agreement on the aspectual particle is dealt with as before. I motivated in the last section the idea that the P in these structures is an aspectual particle,


keyed to the aktionsart of its complement, and I will further motivate this idea below. The noun 'student' here, we shall see, cannot have much in the way of functional structure built above it. Following Adger & Ramchand (2003), I take it to denote a property.

The cleft-strategy, on the other hand, involves a "higher" level kind of predication:

(42) Scottish Gaelic

's e oileanach a th' ann an Calum
cop it student rel be.prs in Calum
'Calum is a student.'

I suggest for this structure that the "subject" is actually the NP *oileanach*, 'student' and that the predication asserts that this is in the set (of sets) denoted by the DP *Calum* (under the generalized quantifier denotation of Calum), extending the proposals in Adger (2011b).

Schematically we have:

```
(43) [CopP cop it student] [CP [ 〈student〉 in Calum ]]
```
Following Adger & Ramchand (2005), the apparent expletive is treated as the predicate of the copular clause, with the meaning of the relative CP being substituted for it during the interpretation procedure.

These two structures give us a hook with which to capture the different meanings of the p- and cleft-strategy, in that the underlying predicational relations are differently represented. The p-strategy involves a kind of stative predication while the cleft-strategy involves property inclusion. I work out the details in the next section.

Before turning to the details and the more general implications, however, it is necessary to show how this analysis I have just suggested is implemented syntactically.

## **5 Motivating the interface analysis: The p-strategy**

As mentioned above, the syntax of p-strategy NP predication constructions is shared by the syntax of certain verbs of position. Typically, grammars of Gaelic list nine or ten such verbs in common use, including *suidh*, 'sit', *seas*, 'stand', *duisg*, 'awaken', *caidil*, 'sleep', *laigh*, 'lie down', etc., although there are others which are rarer. Each of these verbs actually signifies a state transition when used in the


simple past, and they can all occur with the simple aspectual particle *ag*, which marks an overlap between speech and event time, with no temporal terminus to the event time (see Adger 1996; Ramchand 1997).<sup>4</sup>

a. Shuidh mi
   sit.pst I
   'I sat (down).'
b. Bha mi a' suidhe
   be.pst I simp sit.vn
   'I sat/was sitting.'
c. Bha mi nam shuidhe
   be.pst I in.poss.1sg sit.vn
   'I was sitting/seated.'

a. Sheas mi
   stand.pst I
   'I stood (up).'
b. Bha mi a' seasamh
   be.pst I simp stand.vn
   'I stood/was standing.'
c. Bha mi nam sheasamh
   be.pst I in.poss.1sg stand.vn
   'I was standing.'

a. Chaidil mi
   sleep.pst I
   'I fell asleep.'
b. Bha mi a' cadal
   be.pst I simp sleep.vn
   'I slept/was falling asleep.'
c. Bha mi nam chadal
   be.pst I in.poss.1sg sleep.vn
   'I was sleeping/asleep.'

<sup>4</sup> It is interesting that, in various dialectal varieties of English, one finds the use of the passive participle to mark the equivalent of the (c) examples here: %*I was stood/sat there*.

### David Adger

Simple stative verbs, such as *ciallaich*, 'mean', *faic*, 'see' and *crèid*, 'believe', are perfectly well formed with the simple aspectual particle, but not with the various forms of *ann an* in its aspectual incarnation:

(47) Scottish Gaelic


The crucial difference between simple statives and the stative verbs of position is that the latter involve a change of state followed by a temporary steady-state result of that change while the former do not specify any transitions at all. That is, the verbs of position are interval statives (Dowty 1979: 184) and the contribution of *ann an* is to signal that the predication is included in the interval. If we think of this using a locational metaphor, the state is represented as characterizing a temporal location for the subject.

If this characterization is correct, then we expect to see the *ann an* structure used when the action that leads to the steady-state is in fact non-canonical for such actions (for example, one can be standing even though the event that leads to this state is not an event of standing up). This is correct:


(49) Scottish Gaelic

Leum mi nam sheasamh
jump.pst I in.poss.1sg stand.vn
'I jumped to a standing position (literally, I jumped in my standing).'

This kind of data strongly suggests a kind of event decomposition, as argued for by Ramchand (2008): the state in which the subject is asserted to be is separated from the (sub-)event that initiates it in examples like these.

What of the kind of NP predication that we find in the p-strategy? Here too, the subject is characterized as being in a state which has a transitory nature. We can see this by using the standard temporal modifier test:


(50) Scottish Gaelic

Bha Iain na shuidhe fad uair a thìde
be.pst Iain in.poss.3sg.m sit.vn length hour of time
'Iain was sitting for an hour.'

(51) Scottish Gaelic

Bha Iain na oileanach fad dà bhliadhna
be.pst Iain in.poss.3sg.m student length two year
'Iain was a student for two years.'

If this semantic characterization is correct, it will explain the oddness of (52) as a result of the knowledge that one is not usually a cat for a temporary period, that is, it is equivalent to the oddness of (53) in English:

(52) Scottish Gaelic

?\* Tha Lilly na cat
be.prs Lilly in.poss.3sg.f cat
'Lilly is a cat.'

(53) ?\* Lilly is a cat for an hour.

From this perspective, (52) is actually perfectly grammatical, but it is inconsistent with what we know about what it means to be a cat, hence the acceptability judgment given. In fact, one of my consultants said that this sentence was fine if Lilly was a shape-changer, to express which she used the cleft-strategy!

(54) Scottish Gaelic

nam b' e shape-changer a bh' innte
if cop.cond it shape-changer rel be.pst in.3sg.f
'if she was a shape-changer.'

This approach will also explain why verbal states such as *ciallaich*, 'mean' (which lack such transitions) are impossible in p-strategy type structures, since *ann an* requires a state which has the appropriate interval property.

I analyse the syntax of these stative verbs of position by assuming the existence of a St functional category. St creates a bounded interval over which the property denoted by the root holds. Bounded temporal intervals are a kind of eventuality or situation. So I assume, like v, this category has an event variable, and introduces a specifier subject. I'll assume this is done via event-identification (Kratzer 1996), but an implementation in the theory of Ramchand (2008) is equally doable.

The relationship between the interval state given by the St head and the temporal structure of the remainder of the sentence is negotiated by Asp.


This structure can be embedded under an initiating eventuality. In the case where that eventuality is a verb like 'jump' as in (49) above, we have Figure 20.1, where AspP is the complement of the aspectual structure of *leum*, 'jump' (for concreteness I assume the subject raises to its (nominative) case-marking position, the specifier of TP, with the finite verb raised to Fin; Adger 2007).

Agreement appears on Asp as a reflex of the movement operation affecting the subject, as in Figure 20.1.

In the situation where the verbal root is compatible with a process, Asp takes the verbal root directly (or a VP built from it), and introduces the subject via the aspectual head *ag/a'*, which signifies that the interpretation involves a process, as we saw above; see Figure 20.2:

(56) Scottish Gaelic


The general framework here follows Ramchand's in assuming that verbal meanings, including the aspectual meanings and the introduction of arguments, are distributed across various syntactic elements (see also Borer 2005).

Following this general framework, simple state verbs, like 'mean', 'see', 'believe', etc., also generate their subject in the specifier of AspP, rather than as a subject of St, much like process verbs, so (57) has the structure in Figure 20.3.


Figure 20.1: Structure of example (49).

Figure 20.2: Structure of example (56a).


(57) Scottish Gaelic

Tha mi a' faicinn a' chait
be.prs I simp see.vn the.gen cat.gen
'I see the cat.'

Figure 20.3: Structure of example (57).

For the verbs of position in their stative incarnation that we are concentrating on here, then, AspP is Merged with TP, giving Figure 20.4 as a representation for (58):

(58) Scottish Gaelic

Bha mi nam shuidhe
be.pst I in.poss.1sg sit.vn
'I was sitting/seated.'
With this syntax for interval statives in hand, the reason why Gaelic uses this structure, and why Gaelic nominal predication has a restricted interpretation, can be understood to derive from a basic difference in how nouns and verbs work. The theory developed in Adger (2013) takes nouns to be simple sortal predicates of individuals, and verbs to be predicates of eventualities. Indeed, in that theory, the roots are directly contained in a category N or V whose semantics is to introduce either an individual or an event variable.


Figure 20.4: Structure of example (58).

However, events have a semantic combinatory capacity to license arguments which are interpreted as participants of the event. This can be done either via some rule of event-identification (Kratzer 1996), or via a semantics which takes the extended projection of V to describe event structure directly (Ramchand 2008). Whatever the implementation, we can strengthen these proposals to the following:

(59) For an XP to act as a syntactic predicate, licensing an argument, it must have a semantically open eventuality variable.

If we put this proposal together with the idea that nouns are simple sortal predicates of individuals, the upshot is that apparent arguments of nouns have to be introduced as modifiers, while those of verbs can be introduced as specifiers. Adger (2013) uses this theory to explain why apparent arguments to nominals behave so differently to arguments to verbs in terms of their licensing, optionality and syntactic position. However, there is a further consequence not explored in Adger (2013): nominal predication cannot involve simply projecting a subject to a noun, as nouns cannot license arguments:


Since there is no event variable here, Calum cannot be the syntactic subject of a nominal predicate. This is the reason why simple nominal predication is impossible in Gaelic:

(61) Scottish Gaelic

\* Tha Calum oileanach.
be.prs Calum student
intended: 'Calum is a student.'

The solution that Gaelic adopts is to allow St to combine with the root nominal first, as shown in Figure 20.5 and (62).

(62) Scottish Gaelic

Tha Calum na oileanach
be.prs Calum in.poss.3sg.m student
'Calum is a student.'

Here the root *oileanach*, 'student', is a property. Usually it will combine with a categorizer like n (or just N in Adger 2013's theory) which associates it with an individual level variable:

(63) ⟦N⟧ = λP.λx.holds(P, x)

However, St combines with this property, associating it with a variable which ranges over temporally bounded states (cf. Carlsonian stages, Carlson 1977). I will represent such variables as s:

(64) ⟦St⟧ = λP.λs.holds(P, s)


Figure 20.5: Structure of example (62).

Temporally bounded intervals, even if they are temporally bounded intervals of individuals, are a sort of eventuality. This will allow a subject to be Merged to the (now non-nominal) predicate. The linkage between the nominal and the verbal arises, then, because the functional category St generates temporally bounded states, which are a kind of eventuality, even if the state is actually a stage of an individual.
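To make the composition concrete, the semantic steps for a p-strategy sentence like 'Calum is a student' can be sketched as follows, using the notation of (64). The final step, which conjoins the subject with the stage predicate, is my own schematic rendering of event identification (Kratzer 1996); the relation in(Calum, s) is purely illustrative, not part of the official analysis:

```
⟦oileanach⟧ = student                                  (a property)
⟦St⟧        = λP.λs.holds(P, s)                        ((64))
⟦StP⟧       = λs.holds(student, s)                     (a predicate of stages)
⟦Calum StP⟧ = λs.holds(student, s) ∧ in(Calum, s)      (event identification; assumed)
```

Since s is an eventuality variable, Asp and T can then locate this stage relative to speech time, deriving the temporary, interval-stative reading.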

This theory makes a prediction that modifiers which require an individual variable should be impossible in such structures. For example, relative clauses, which require a modification relation to be set up over individual variables, will be ruled out, as these structures never contain an individual level variable. This turns out to be correct:<sup>5</sup>

(65) Scottish Gaelic

\* Tha a phiuthair na comhairle a gheibh a' vote agam.
be.prs his sister in.poss.3sg.f councillor that get.fut the vote at.1sg
'His sister is a councillor who I will vote for.'

<sup>5</sup>Many thanks to Jason Ostrove for testing a number of these examples for me while on fieldwork in the Hebrides.


Here, *ann an* combines with StP, which denotes a temporally bounded period of an individual (a stage), not an individual. A relative clause combines with an individual (via predicate modification), and hence is impossible here.

A restricted range of modifiers that can work at the stage level, such as *ùr*, 'new', are correctly predicted to be acceptable:

(66) Scottish Gaelic

Tha Calum na oileanach ùr
be.prs Calum in.poss.3sg.m student new
'Calum is a new student.'

The adjective *ùr*, 'new', modifies a temporal aspect of being a student, and hence is acceptable.

This approach also predicts the absence of quantifiers and numerals in the Gaelic structures. Even though numerals and weak quantifiers are usually thought of as maintaining the predicative type of an NP, they are impossible in the p-strategy.

a. \* Bha iad nan còig oileanaich
   be.pst they in.poss.3pl five students
   'They were five students.'
b. \* Bha iad nam mòran oileanaich
   be.pst they in.poss.3pl many students
   'They were many students.'

The effect follows straightforwardly on the account given here: stages are things that can't be counted (numerals and quantifiers, again, require individual variables).

The fact that these numerals are possible in the cleft-strategy provides a further argument against the unified analysis of the two strategies that I sketched in §3:

a. 's e còig oileanaich a bh' annta
   cop it five students rel be.pst in.3pl
   'They were five students.'
b. 's e mòran oileanaich a bh' annta
   cop it many students rel be.pst in.3pl
   'They were many students.'


Schreiner (2015) presents an analysis of the p-strategy that covers some of the same empirical ground as that presented here. She develops the proposals of Roy (2006), arguing that nominals, in general, have an event variable, and that different kinds of functional structure generated above Ns give rise to the interval stative property. In Gaelic predicative structures, the nominal has to denote what Roy calls a dense predicate (essentially, dense predicates are temporally homogeneous; they are analogous to mass predicates, which are homogeneous in mereological structure).

Schreiner's syntactic analysis takes the constituent headed by *ann an* in the p-strategy to be a true PP, with a full DP as its complement. This DP obligatorily has a possessor inside it, which is responsible for the agreement on *ann an*. However, this is inconsistent with the restricted set of modifiers that these nominal predicates allow. While the absence of numerals is expected, if nominal roots in these structures have to be homogeneous, the absence of relative clauses is surprising (relatives are well formed with mass nominals, of course).

To a certain extent, Schreiner's analysis and mine are compatible in terms of the interpretations available for the nominal predicate, as both rely on a specialised functional structure generated above the nominal root. However, because, for Schreiner, Ns have an event variable, her analysis doesn't provide a straightforward explanation for the impossibility of simple NP predication as in (69), which I take to be a desideratum:

(69) Scottish Gaelic

\* Tha Calum oileanach.
be.prs Calum student
intended: 'Calum is a student.'

Schreiner suggests that this may have something to do with transnumerality in the language, and suggests that nouns in Gaelic are number neutral (unspecified for number). However, most nominals in Gaelic, and certainly all the ones in the examples discussed here, work morphologically and semantically as simple count or mass nominals. Strikingly, when the subject is plural, the predicate nominal has to be plural too:

(70) Scottish Gaelic

a. Tha sinn nar deugairean
   be.prs we in.poss.1pl teenager.pl
   'We are teenagers.'


b. Tha i na deugaire
   be.prs she in.poss.3sg.f teenager
   'She is a teenager.'

We can make sense of this if the root, in fact, bears a plural property (e.g. it will apply to some non-atomic point in a lattice, as in Link 1983) vs. a singular property. This means that when the predicate applies to the s variable via St, it is a predicate of stages of multiple individuals. I don't see how these facts about number marking on the predicate nominal can be made compatible with a proposal that nouns are number neutral. These facts are even more striking given the impossibility of number agreement (or any φ-agreement) on predicate adjectives:

a. na balaich mòra
   the.m.pl boy.pl big.pl
   'The big boys.'
b. Tha na balaich mòr
   be.prs the.m.pl boy.pl big
   'The boys are big.'
c. \* Tha na balaich mòra
   be.prs the.m.pl boy.pl big.pl
   intended: 'The boys are big.'

Adjectives agree in number in attributive position, but not in predicate position. Predicate position, then, is not accessible to agreement (which conforms with the generalization that verbs do not agree with their subjects in Gaelic). But then that suggests that number in examples like (70) is semantically interpreted, and that nouns are not number neutral. An account of the impossibility of simple nominal predication in Gaelic resting on the idea that nouns are number neutral is untenable.

## **6 Motivating the interface analysis: The cleft-strategy**

I turn now to the cleft-strategy. The claim here is that the apparent predicate is a subject, but it is the subject of a higher level predication. That is, it is similar to the copular predication mentioned in footnote 3:


(72) Scottish Gaelic

Is eun sgarbh
cop bird cormorant
'A cormorant is a bird' (Generic)

In (72) the subject NP *sgarbh*, 'cormorant' is asserted to be in the set (of sets) denoted by the predicate *eun*, 'bird'.

Adger & Ramchand (2003) argue that this kind of structure involves a predicational head which raises to a higher position, pied-piping its complement, and creating a predicate inversion structure:

The predicational head is *is*. Adger and Ramchand give *is* a semantics which allows it to combine with a nominal, and assert that the property the nominal denotes holds of a subject as follows.

(74) λP.λx.holds(P, x)

The motivation for this semantics is that *is* cannot occur in tensed sentences. It has only two forms: *is*, which marks that the proposition currently holds, and *bu*, which marks that it doesn't currently hold. It may have held in the past, be going to hold in the future, or be a possibility. This copular element then seems to mark a distinction which is close to a notion of "current actuality", perhaps to be related to evidentiality.

Importantly, for the claims I am making here, the copular structure in Gaelic does not involve predication in the normal sense: the "subject" is not a participant in a situation and is not a thematic argument of the apparent predicate. Rather the copula here denotes a pure inclusion relation: the set of cormorants is in the set of birds. The label Pred here, then, is somewhat misleading, and I'll replace it with simply Cop.

Adger and Ramchand extend their idea to apparent equatives in Gaelic, which have a surface form reminiscent of clefts:


(75) Scottish Gaelic

's e Calum an oileanach
cop.prs it Calum the student
'Calum is the student.'

We argued that in these constructions the pronominal element *e* acts as the complement to the copula. This pronoun is then anaphoric to a right adjoined definite DP:

Equatives, then, do not exist and equative meanings are constructed via a copular structure plus an anaphoric dependency.

My suggestion here is simply to extend this idea to true clefts, and specifically to clefts that involve apparent NP predicates. The copula signals inclusion of one class in another in (72), and it performs an identical function in the cleft-strategy for nominal predication.

There are two analytical premisses that underlie this claim: the first is an analysis of the syntax and semantics of the relative clause part of the cleft-strategy; the second is an analysis of what motivates the obligatory nature of the clefting process.

The first premiss is fairly straightforward to motivate: the preposition *ann an* in the relative clause portion of the cleft-strategy behaves, as we have seen, like a normal preposition, so we can assume it is syntactically a true preposition with a DP complement. That is, we have the following syntactic structure:

(77) [PredP NP [Pred' Pred [PP in DP ]]]

The associated semantics to be justified is that this PP functions as a predicate for a property-denoting subject NP. That is, the DP here is a generalized quantifier, denoting a set of properties, and the whole structure is interpreted as asserting that the set of properties denoted by the NP is included in this. This is


similar to the copula, but involves the situational variable usually connected to PP predication.

However, this seems inconsistent with an observation discussed in §3. There I showed that structures of the following sort cannot be used to make a nominal predication:

(78) Scottish Gaelic

\* Tha oileanach ann an Calum
be.prs student in Calum
intended: 'Calum is a student.'

This claim, although true, is not the whole story. In fact this kind of structure can be used to say that Calum has student qualities, although he is not a student. For example, if Calum is a one-year old child, but likes playing with books, then (78) is an appropriate comment. So the \* judgment in (78) refers not to a structural impossibility, but to an impossible reading for that structure. It is in fact well formed with the reading that Calum has student qualities.

Similarly, one can say:

(79) Scottish Gaelic

Tha ceann mòr ann an Calum
be.prs head big in Calum
'Calum is big-headed.'

(79) cannot mean that Calum literally has a big head, but it can mean that he has the qualities associated with big-headedness. In fact, this structure can be used to state that the complement of the P has the inherent quality denoted by the NP in general. Let us roughly symbolize this as (80), where the function Qual returns a set of properties associated with the property denoted by the NP.

(80) Qual(NP) is a set of properties such that each property is characteristic of the individuals denoted by NP

This kind of predication is equivalent to that seen in English constructions like (81):

(81) I see an excellent king in Jason.

Here Jason is not necessarily a king, and certainly not an excellent one, but he has the qualities necessary to be one.


The interpretations of sentences like (79) motivate the idea that the relative clause part of the cleft-strategy has a syntax involving an NP subject with a PP predicate and a semantics where the NP subject denotes a set of properties asserted to be included in the properties denoted by the complement of the preposition *ann an*.

The second part of the analysis that still needs to be explained is why the relativization is obligatory. Why doesn't Gaelic just allow (78) with the meaning 'Calum is a student'?

The answer to this is that the peculiar quality reading of these NP subjects is lost whenever the quality denoting NP is extracted.

Both of the following examples have only literal readings:

(82) Scottish Gaelic

\* Dè an oileanach a th' ann an Calum
what the student rel be.prs in Calum
intended: 'What kind of student is Calum?'

(83) Scottish Gaelic

\* 'S e ceann mòr a th' ann an Calum
cop it head big rel be.prs in Calum
intended: 'It's big-headed that Calum is.'

The reason for this is not entirely obvious, but the generalization is clear, and constitutes the second step of the argument for justifying the analysis presented here:

(84) Qual cannot apply to an A-bar bound element.

This seems to be true in English as well. The relevant reading is only preserved under extraction when the noun 'kind' is used:

(85) a. What kind of a king do you see in Jason?

b. \* What king do you see in Jason?

Similarly for Gaelic:

(86) Scottish Gaelic

Dè an seòrsa oileanach a th' ann an Calum
what the sort student rel be.prs in Calum
'What kind of a student is Calum?'


(87) Scottish Gaelic

\* Dè an oileanach a th' ann an Calum
what the student rel be.prs in Calum
intended: 'What kind of student is Calum?'

I'll follow Adger & Ramchand (2005) and Adger (2011a) here and take the view that wh-movement, relativization and clefting in Gaelic all involve an A-bar bound bare resumptive pronoun, although nothing about the story presented here changes if we have, instead, a trace of A-bar movement.

With these two analytical premisses in place, we can now take (88) to be the base structure to which the cleft applies:

(88) Tha pro ann an Calum
     be.prs pro in Calum
     = ℘ ∈ λP.P(Calum)

Here the pronominal is an NP, and its interpretation is as a variable (℘) ranging over properties. The preposition *ann an* asserts that whatever property is assigned to pro will be included in the set of properties denoted by Calum. The structure here is the same as (79), but with the subject NP being a pro ranging over properties.

Relativizing over this structure, we create a predicate of properties:

(89) a th' pro ann an Calum = λ℘.℘ ∈ λP.P(Calum)

The function Qual cannot apply, since pro is A-bar bound.

Putting this outcome together with the analysis I motivated for copular clauses, we derive the structure in Figure 20.6 for the cleft-strategy.

(90) Scottish Gaelic

'S e oileanach a th' ann an Calum
cop it student rel be.prs in Calum
'Calum is a student.'

Here the relative clause *a th' ann an Calum* abstracts over the property variable denoted by the pro in the specifier of TP, giving the meaning of the relative clause as a set of properties which are properties of Calum. The pronoun in the copular clause gets its meaning by straightforward substitution, and the copula asserts that the property of studenthood is in the set of properties that Calum has. This analysis simply extends the analysis of clefts I offered in Adger (2011b) to these characterising clefts.
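Using the notation of (88) and (89), the substitution just described amounts to the following derivation (a restatement of the steps in the text, not an additional mechanism):

```
⟦a th' pro ann an Calum⟧ = λ℘.℘ ∈ λP.P(Calum)      ((89): a predicate of properties)
⟦e⟧                      = λ℘.℘ ∈ λP.P(Calum)      (substitution of the CP meaning for the pronoun)
⟦'S e oileanach …⟧       = student ∈ λP.P(Calum)    (application to the subject ⟦oileanach⟧)
```

That is, the property of studenthood is asserted to be among the properties that Calum has.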

Figure 20.6: Structure of example (90).

The final question is, for this kind of reading, why the cleft is obligatory. The answer to this, from the perspective outlined here, is simply that the Qual function would otherwise apply to the subject of the clause. It may be that this function is itself connected to some syntactic position (for example, perhaps Qual can only apply to case marked DPs, and A-bar bound pro does not have to be case marked because of its lack of overt morphology), but I leave this question open here.

## **7 Conclusion**

A standard view of predicate nominals (e.g. Partee 1987; Higginbotham 1987) is that some projection of the nominal has a predicative type (〈e, t〉) and that this is what is seen in apparent examples of NP predication. In developments of such theories, we see three "layers" of projection in the DP (e.g. Zamparelli 2000): a kind level, a predicative level, and an argumental level. The predicative level is that used in cases of NP predication.


However, this is clearly not the case in Gaelic, and the question is why.

One possibility is that Gaelic lacks the predicative projection of the nominal. It has only a property level projection, and an argument level projection (this is the view taken in Adger & Ramchand 2003). But this is stipulative. The alternative I suggest is that subjects of predication in syntactic specifier positions are generally impossible in nominals, as such subjects require eventive functional structure to be introduced. The category N creates predicates of individuals, not events, and the extended projection of N develops the semantics of an individual, not of a state of affairs. This set of constraints on the syntax–semantics interface leaves languages with a problem: how do they build the meaning of NP predication? Gaelic shows us two ways in which a language can solve this problem. The p-strategy involves co-opting structure which does have an event variable, while the cleft-strategy uses a relative clause to create the necessary semantic glue.

What of languages like English? Nominal predication is restricted in such languages too, when the presence of the verb *be* is controlled for. Nominals are decidedly odd in *be*-less predication compared to PPs and APs:

	- b. With Lilly sick, we should get some special cat food.
	- c. With Lilly under anaesthetic, we can go ahead with the operation.

From the perspective of the theory offered in this paper, English *be* performs a function similar to, but more general than, Gaelic *ann an*. Indeed, even with *be*, we can see the same restriction we found in Gaelic: when the predicate is restricted to an interval state by a temporal modifier, relative-clause modification becomes impossible:

(92) ?\* Calum was a student for three years that Ian knew.

The same core principles regulating the relationship between syntax and semantics are at work in both kinds of languages, but they evade the restrictions imposed by those principles in different ways.

## **Abbreviations**


### David Adger


## **Acknowledgements**

One of my first grown-up conference papers was at a Celtic syntax workshop organised by Ian Roberts and Bob Borsley in Bangor. That paper argued that measure phrases in Scottish Gaelic are a kind of defective nominal and that, because of this defectiveness, they are incorporated into the syntactic and semantic dependencies set up by the verbal extended projection. This paper, written in appreciation of Ian's important impact on my linguistic thinking, returns to that exact same intuition for predicate nominals, showing either that I'm stubborn or that I can't move on! I've presented this set of ideas at the workshop on predication in Ontario in 2009, then, after a long hiatus, at the Université de Paris VIII in 2014 and at the University of Ulster in 2015. Many thanks to all for comments and suggestions, as well as to Caroline Heycock for comments on an early version. Many thanks also to Iseabail NicIlleathain and Sìleas NicLeòid for help with data, to Jason Ostrove for checking some examples for me while he was in the field, and to two anonymous reviewers for this volume.

# **References**




# **Chapter 21**

# **Rethinking principles A and B from a Free Merge perspective**

# Marc Richards

Queen's University Belfast

This squib sketches out the beginnings of a bottom-up, minimalist rethinking of pronominal reference constraints (essentially, principles A and B of the binding theory) in terms of an approach to grammar-internal optionality originally pursued in Biberauer & Roberts (2005) and Biberauer & Richards (2006). By combining a movement theory of binding (Hornstein 2001; 2013; Kayne 2002; Abe 2014) with phase theory (Chomsky 2000 et seq.), the essential difference between local binding and local obviation reduces to the choice between Internal Merge and External Merge at the phase level, each yielding a distinct interpretive outcome at the conceptual-intentional (CI) interface. Further, if the phase constitutes the maximal domain in which linguistic constraints can apply, then interpretive freedom is expected beyond the phase level. In this way, restrictions on the interpretation of pronouns turn out to be the CI equivalent of ordering restrictions at the sensorimotor interface (PF), which likewise obtain up to the phase level but not beyond (Richards 2004; 2007).

# **1 The price of freedom**

In its more recent developments, the Minimalist program has moved away from its earlier emphasis on the formal features that trigger operations and the formal constraints that restrict them. Accordingly, from the perspective of the strong Minimalist thesis (SMT), in which language-specific technology is expensive (i.e. adds to the "first factor"; Chomsky 2005), optionality should no longer surprise us. The free application of operations is the default expectation.<sup>1</sup> Whereas earlier Minimalism (Chomsky 1995) viewed optionality as problematic, with optional rules and operations effectively excluded by a conspiracy of last resort and full interpretation, it is in fact "obligatoriness" that is unexpected, as any limitation on this freedom has to somehow be legislated for in the form of a language-specific rule or constraint, thus departing from the SMT (unless this restriction can be reduced to more general, "third-factor" considerations). By contrast, there is no need to legislate for optionality. A maximally empty, minimally specified UG will necessarily leave many options open, giving rise to operational indeterminacies, as explored and exploited in "underspecification" models of (parametric) variation (see, e.g., Uriagereka 1994; Biberauer & Richards 2006; Berwick & Chomsky 2011; Richards 2008; Kandybowicz 2009; Boeckx 2011; Roberts & Holmberg 2010); it also leads naturally to an "overgenerate and filter" view of the syntax–interface relation (see, e.g., Richards 2004; 2007 on the syntax–phonetic form (PF) relation), perhaps based on *Free Merge* (cf. Chomsky 2007; 2008; 2013; 2015 – see footnote 1; also Boeckx 2011). Operative freedom itself now comes for free; it is the restrictions on this freedom (rules, constraints: the mechanisms of obligatoriness) that come at a price, carrying the burden of explanation.

Marc Richards. 2020. Rethinking principles A and B from a Free Merge perspective. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 497–509. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972874

In this light, we need to reconsider how (and where) apparent strictures (or their effect) might arise in this system. A simple way to curb the excesses of a free syntax is to make it responsible to the interfaces, so that the choices we make (in the syntax) have consequences (at the interface). From this perspective, sometimes called *interface economy* (cf. Reinhart 1995; Fox 2000; Chomsky 2001; Biberauer & Richards 2006), the choice of applying a syntactic operation like Merge may itself be free, but this choice must be cashed out at the interface in the form of an interpretive effect – an *effect on outcome* (EOO; Chomsky 2001: 34). Optional operations thus have an obligatory EOO. Equally, where a derivational option is independently excluded,<sup>2</sup> we might expect the opposite pattern to obtain. These two scenarios were summarized in Biberauer & Richards (2006) as in (1).

(1)	a. Optional operations feed obligatory interpretations.
	- b. Obligatory operations feed optional interpretations.

<sup>1</sup>Cf. Chomsky (2015: 10–11) on "the lingering idea, carried over from earlier work, that each operation has to be motivated by satisfying some demand. But there is no reason to retain this condition. Operations can be free, with the outcome evaluated at the phase level for transfer and interpretation at the interfaces".

<sup>2</sup> For example, the phase impenetrability condition might exclude the option of Internal Merge, where this would cross a phase boundary. See §2.2 below.

### 21 Rethinking principles A and B from a Free Merge perspective

The refinement I would like to propose and pursue here is that an EOO will only be discernible up to a certain point in the derivation, namely the phase level. In terms of Biberauer & Richards (2006), this means that the phase is the level at which the system "minds" (i.e. the level at which the derivational choices within a phase are made to count). Beyond the phase level, the system stops caring,<sup>3</sup> and interpretive freedom will therefore result (i.e. a lack of EOO, equivalent to 1b). Let us refer to this as Claim 1, as in (2).

(2) Claim 1

The phase is the maximal domain in which syntactic/interpretive constraints can apply. Each choice within a phase registers a distinct EOO at the interface.

Effectively, the EOO rationale in (1a), in combination with phases, will conspire to *give the illusion* of local (syntactic) constraints. In terms of (free) Merge, the choice between applying Internal or External Merge at a given point in the derivation – yielding copies versus repetitions, respectively, at the interface – can only make a difference within a phase. The relevance of the copy/repetition distinction at the interface is therefore predicted to break down beyond the phase level, as (3) ostensibly confirms.

(3) *He*<sub>i</sub> thinks [CP that *he*<sub>i/j</sub> can help Mary ]

Here, due to the intervening CP phase boundary, the higher instance of *he* may be interpreted as either a copy of the lower *he* (hence referentially identical), or else as an independent repetition (hence with independent reference). By contrast, where this choice is made within a phase, EOOs are predicted to arise, as summarized in Claim 2.

(4) Claim 2

Merge *within* a phase will be constrained (e.g. subject to particular interpretive restrictions) in a way that Merge *across* phases is not.

At PF, this yields order-preservation constraints on phase-internal movement (Richards 2004; 2007), as I briefly review in §2.1. This then leads to my main claim, in §2.2: namely, that binding conditions (principles A and B) can be rethought, and made sense of, as the conceptual-intentional (CI) equivalent of order preservation at PF.

<sup>3</sup>This follows from the idea that phases are the units of computation, and that there is no memory of derivational information beyond the phase level (cf. Chomsky 2015: 8: "The basic principle is that memory is phase-level – as, e.g., in distinguishing copies from repetitions").


## **2 Escape to freedom**

The "obligatoriness" of local binding and obviation constraints, as captured by principles A and B of the binding theory, is unexpected from the minimalist perspective set out in the previous section. A Roberts-style "rethink" of this pervasive property of human language is therefore in order, with the aim of reconciling it with the SMT. If we can rationalize and naturalize the binding principles in terms of (2) and (4), i.e. as emergent EOOs, we will have gone some way towards achieving this aim. To see how this might work, it is worthwhile revisiting the analysis of Holmberg's generalization from Richards 2004, in which (2) and (4) conspire to constrain the interpretive output of Merge at the PF interface.

### **2.1 Phase-internal interpretive restrictions on Free Merge at PF: Order preservation**

There is reason to believe that local movements such as object shift are subject to certain ordering restrictions that do not hold of longer-distance or successive-cyclic movement. For VO languages, this restriction is famously captured by Holmberg's generalization (HG; Holmberg 1986; 1999): essentially, "VO in" implies "VO out", thus excluding object shift in cases where the verb does not move to a position above the object, as in (5b).

(5) Icelandic
	- a. Nemandinn las [*v*\*P (*bókina*) *t*<sub>nemandinn</sub> ekki [VP *t*<sub>las</sub> (*bókina*) ]]
	  the.student read (the.book) not (the.book)
	  'The student didn't read the book.'
	- b. Nemandinn hefur [*v*\*P (\**bókina*) *t*<sub>nemandinn</sub> ekki [VP lesið (*bókina*) ]]
	  the.student has (the.book) not read (the.book)
	  'The student hasn't read the book.'

Taking short-distance movement of the object shift kind to be *v*P- (and thus phase-) internal, the relevant generalization seems to be that ordering freedom arises only once the *v*P phase is escaped. Thus longer-distance (cross-phasal) movement out of the *v*P phase is free to invert the original order, as in the case of A-movement/passivization, *wh*-movement, topicalization, etc.

(6)
	- b. John was [*v*P rescued (*John*) ]
	- c. John, I [*v*\*P like (*John*) ]
	- d. Which book did you [*v*\*P read (*which book*) ]


The constraint on short-distance movement such that the derived order must reinstate the base order is an unexpected limitation on Free Merge; it is another unexpected instance of "obligatoriness". The phase-internal nature of this constraint, combined with the assumption that linear order is imposed only at the sensorimotor interface and is not a property of the syntactic structure itself,<sup>4</sup> suggests an approach to HG in terms of *cyclic linearization* (i.e. linearization by phase). Such a system is notably proposed in Fox & Pesetsky (2005), with the interesting property that ordering freedom is allowed within a phase but not beyond, contra the claims in (2) and (4) above. An alternative is offered in Richards (2004; 2007), in which the same effects are delivered by the opposite set of assumptions – i.e., ordering freedom is allowed beyond the phase but not within, in conformity with (2) and (4). This alternative follows from a Merge-based linearization algorithm in which (symmetrical) Merge overspecifies the word order between Merge pairs (sisters), giving PF both options each time (head-first, head-final); cf. Epstein et al. 1998. Then, at the phase level, the interface simply discards one of these options, consistently. Such an "overgenerate-and-filter" approach to linearization may be expressed as in (7).

(7) (where α = head, β = complement)
	- a. Head-initial = delete all *Comp* < *Head* [i.e. {〈α,β〉, 〈β,α〉} → {〈α,β〉}]
	- b. Head-final = delete all *Head* < *Comp* [i.e. {〈α,β〉, 〈β,α〉} → {〈β,α〉}]
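The deletion scheme in (7) is algorithmic enough to be sketched in a few lines of code. The following is my own toy illustration (the function name and representation are mine, not the author's formalism): Merge hands PF both orders for every sister pair, and at the phase level the interface consistently deletes one of them.

```python
# Toy sketch of the "overgenerate-and-filter" linearization in (7).
# Merge overspecifies the order of each sister pair; the interface
# then deletes one option per pair, consistently, at the phase level.

def linearize(sister_pairs, head_initial=True):
    """sister_pairs: list of (head, complement) Merge pairs.
    Returns the precedence statements that survive at PF."""
    # Merge overspecifies: each pair {a, b} yields both <a,b> and <b,a>.
    overgenerated = []
    for head, comp in sister_pairs:
        overgenerated.append((head, comp))  # Head < Comp
        overgenerated.append((comp, head))  # Comp < Head
    heads = {h for h, _ in sister_pairs}
    if head_initial:
        # (7a): delete every Comp < Head statement.
        return [pair for pair in overgenerated if pair[0] in heads]
    # (7b): delete every Head < Comp statement.
    return [pair for pair in overgenerated if pair[0] not in heads]

# A head-initial (VO) grammar keeps only V < O;
# a head-final (OV) grammar keeps only O < V.
print(linearize([("read", "book")]))                      # [('read', 'book')]
print(linearize([("read", "book")], head_initial=False))  # [('book', 'read')]
```

On this sketch, an O < V statement created by phase-internal object shift in a VO grammar is simply deleted ("undone") at the phase level, which is the effect described in the following paragraph.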

The contrast between (5) and (6) is a straightforward consequence of this system. As depicted in (8), short object displacement to spec-*v*P across V is only orderable by (7a) where further movement of V across the displaced object takes place, so that the latter becomes the tail of a V < O chain, rather than the head of an O < V chain. (Any such O < V instruction would be deleted and thus "undone" at PF, by 7a.)

(8) Object shift (phase-internal)

<sup>4</sup>This long-standing insight is first elaborated in Chomsky (1995: 334–340); more recently, it finds expression in the claim that "[o]rder is relegated to externalization" (Chomsky 2015: 4).


The upshot is that HG is derived for exactly that subset of languages in which it holds (i.e. those set to (7a): VO languages). Beyond the *v*P phase level, however, the information about the original ordering sister is lost, since memory is phase-bound (cf. footnote 3), and the displaced DP is effectively relinearized in the higher phase (hence the possibility of inverted orders, as in 6). Interpretive freedom at PF is thus the result of escaping the phase; the expected optionality re-emerges beyond the phase level.

### **2.2 Phase-internal interpretive restrictions on Free Merge at sem: Binding principles**

An obvious question is what the equivalent of PF order preservation would be at the CI-interface. Is there a similar basic pattern to the one in (5–6) in which Merge choices made locally (within the *v*P phase) are interpretively constrained at the interface, with interpretive freedom again re-emerging once the phase is escaped? My contention here is that principles A and B of the binding theory instantiate just this pattern, and thus again implicate a minimalist system based on (2) and (4).

Clearly, in order to reconstruct the principles of binding in terms of Merge choices, some version of a movement theory of binding (MTB) must be assumed (Hornstein 2001; 2009; 2013; Kayne 2002; Abe 2014), with anaphors and/or pronouns analysed as pronounced lower copies (cf. also Heinat 2003). The present article is not the place to provide a full justification of the MTB or to pursue the technicalities of lower-copy realization (see above references and related work); suffice it to say that I take the MTB to be the null hypothesis in a system of unconstrained ("free") Merge, in which Internal Merge to θ-positions cannot (and should not) be excluded in the syntax, and in which Internal Merge provides the simplest possible mechanism by which to derive referentially identified occurrences (tokens), in the form of copies. However, in a crucial departure from earlier versions of Hornstein's MTB,<sup>5</sup> it cannot be the case that anaphors and pronouns (principles A and B) stand in an "elsewhere" relation, such that pronouns result wherever movement is not possible. Rather, the present system relies on there being a critical choice point (within the phase) where both options (Move and Merge) are equally available, with each choice then yielding a complementary outcome at the interface.

We restrict ourselves here to considering just the core facts of principles A and B. Our aim is simply to derive the complementary distribution of anaphors and pronouns within a given local domain, and thus the fundamental difference between obligatory binding and obligatory obviation. These core facts are given in (9).

<sup>5</sup>More recent versions, such as Hornstein (2013), come a lot closer to the present proposal.


To derive the contrast between (9a) and (9b), consider first the derivation at the point where the external argument (EA) is merged, after *v*\* has been merged with its complement VP. At this point, there is a free choice between Internal Merge (IM) or External Merge (EM): either option is in principle possible here (and in practice too, as long as the VP and its contents have not yet been transferred). Since this choice is made phase-internally, the information as to which choice is made is available at the interface, upon Transfer. Each option is therefore exploited at the interface in the form of a different EOO (cf. 2).

According to the first option, the internal argument (IA) may be raised to spec-*v*P to form the EA, as in (10).<sup>6</sup>

(10) Option 1: Internal Merge of the IA to form the EA [*v*\*P he *v*\* [VP likes him (→ himself) ]]

Since IM is chosen and IM here is optional (given the availability of another option, viz. EM), this choice must have an effect at the CI-interface (cf. 2). The two occurrences of the relevant lexical item are detectable as copies at the phase level; therefore, the result (EOO) is obligatory referential identity at CI (i.e. *he* = *himself*, or a covariant/bound-variable reading with a quantificational antecedent, as in *Every boy likes himself*), in line with (1a).

<sup>6</sup>The lower copy here is spelled out overtly, as an anaphor, and not deleted or left unpronounced, as it is in the case of passive/unaccusative IM of the IA. The salient difference between the two cases that accounts for this divergence is the nature of the *v* head. The defective *v* associated with passives/unaccusatives is unable to value Case on the IA (cf. Chomsky 2001). The IA thus remains active, raising automatically to the phase edge to evade Transfer (cf. Chomsky 2000). Since the lower (active) copy is not transferred, it cannot be realized at PF (i.e. pronounced). By contrast, (10) involves a transitive *v*\*, which values Case (accusative) on the IA. Thus deactivated, the lower copy of the IA is a candidate for Transfer and thus for PF-realization.

Alternatively, the other option is for EM to apply at this stage, as in (11).

(11) Option 2: External Merge of the EA
	[*v*\*P he *v*\* [VP likes him ]] (the EA merged by EM<sup>2</sup>, the IA by EM<sup>1</sup>)

Since EM is chosen and EM here is optional (given the availability of another option, viz. IM), this choice must likewise have a distinct effect at the CI-interface. The two occurrences of the relevant lexical item are detectable as independent repetitions at the phase level; therefore, the result (EOO) is obligatory disjoint reference at sem (i.e. *he* ≠ *him*, or the absence of a bound-variable reading with a quantificational antecedent, as in *Every boy likes him*), again in line with (1a).
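The complementary outcomes of options 1 and 2, together with the cross-phasal freedom illustrated in (3), can be schematized in a small toy sketch. This is my own illustration, not the paper's formal system; the function name and string labels are mine.

```python
# Toy schematization of claims (2)/(4): within a phase, the choice
# between Internal Merge (copies) and External Merge (repetitions)
# fixes the interpretation; across a phase boundary the copy/repetition
# record is lost and reference is free.

def interpretations(merge_type, same_phase):
    """Possible referential relations between two nominal occurrences."""
    if same_phase and merge_type == "IM":
        return {"coreferent"}   # copies: principle A effect
    if same_phase and merge_type == "EM":
        return {"disjoint"}     # repetitions: principle B effect
    # Beyond the phase, memory of the choice is gone: interpretive freedom.
    return {"coreferent", "disjoint"}

assert interpretations("IM", same_phase=True) == {"coreferent"}   # he ... himself
assert interpretations("EM", same_phase=True) == {"disjoint"}     # he ... him
assert interpretations("EM", same_phase=False) == {"coreferent", "disjoint"}  # cf. (3)
```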

Turning finally to (9c) and (9d), here the two indexed positions cannot be derivationally related by IM. In the case of (9c), this is due to the presence of at least one intervening phase boundary (the CP headed by 'that'). The embedded IA is therefore rendered inaccessible to the matrix subject position, in accordance with the phase impenetrability condition. In (9d), an interarboreal or sideward dependency would be required to link the two positions. It is arguable that such dependencies do not conform to the simplest conception of Merge (cf. Chomsky 2007): in this case, *him* is not contained in the sister of *his*, and thus *his* cannot be the result of IM of *him*. In both cases, therefore, only EM is possible.<sup>7</sup> Since EM is now obligatory (there being no option of IM, unlike in (10–11) above), it will be associated with interpretive freedom, in line with (1b). Consequently, incidental coreference/covariance becomes a possible interpretation. As with the trans-phasal dependencies in (6), crossing a phase results in liberation at the interface. This opening up of interpretive possibilities has the interesting consequence that there are two derivational sources for the same interpretation. Thus, for example, a bound variable may be derived either via the phase-internal, obligatory route (cf. 9a), or via the cross-phasal, optional route (as in 9c,d). I leave further exploration of this consequence for future research.<sup>8</sup>

<sup>7</sup>The same is true for those cases where the lower pronoun (bound or otherwise) is contained within an island, such as *Every actor*<sub>i</sub> *denied the rumour that the studio fired him*<sub>i/j</sub>.

<sup>8</sup>Hornstein (2013) independently argues for a non-uniform approach to bound variables (i.e. those which are the product of movement and those which are not), on compelling empirical grounds. The approach proposed here thus lends further support to Hornstein's hunch. Note, too, that any c-command requirement on bound variables will only characterize the first kind (the local, IM-derived kind). Thus bound variables are readily available in (9d)-type configurations, as in *Everyone*<sub>i</sub>*'s mother likes him*<sub>i/j</sub>, where (importantly) the non-coreferential/non-covariant interpretation of *him* is also an option. The same goes for non-local variable binding, as in *Every criminal*<sub>i</sub> *thinks the police are after him*<sub>i/j</sub>, instantiating (9c), where again the bound reading is only optional. As discussed in §1, there is no need for the grammar to legislate for optionality, as this is the default state of affairs from the minimalist perspective; only non-optional, forced readings are unexpected and demand an explanation.

## **3 Conclusion**

In the same spirit as Hornstein (2009; 2013), we have tried to shed light on the question of why restrictions such as the binding principles should exist at all (i.e. why they should be a characteristic property of human language). The answer we have begun to develop here offers a potential first step in "rethinking" the binding theory from the ground up. It is based on the idea that whilst Merge itself might be free, its interpretation is not (up to the phase level), due to the EOO rationale in (1/2). The MTB in conjunction with phases then delivers the *effect* of interpretive constraints (principles A and B).<sup>9</sup> Binding conditions reduce to the differential interpretation of free Merge choices within a phase (i.e. the maximum domain in which the system can "care"): the choice between IM and EM is cashed out at the interface in a complementary manner, yielding obligatorily coreferent copies (local binding) versus obligatorily disjoint repetitions (local obviation), respectively. By contrast, interpretive freedom (including optional coreference) arises with cross-phasal dependencies, as default optionality re-emerges beyond the phase level.

Finally, it should be noted that the sketch presented above leaves many questions open and avenues unexplored. I am grateful to two anonymous reviewers and an editor for highlighting some of these. Amongst the most immediate empirical challenges facing this approach are long-distance reflexives and other cross-clausal referential dependencies, such as those holding between a null embedded subject and a matrix overt subject in null-subject languages in structures like (3); non-local SE anaphors (contrasting with local SELF anaphors) are another relevant point of variation here (cf. Reinhart & Reuland 1993; Lidz 2001). Such cases present a problem for the model proposed here, as they all involve obligatoriness effects that appear to hold beyond the phase level, i.e. where optionality would be predicted (cf. 9c). An approach in terms of cancellation or extension of the intermediate phases suggests itself for such cases of non-local binding (see Livitz 2016 for such an analysis of Russian embedded null subjects), or else the relevant variation might be attributed to the nature of Transfer itself (cf. the distinction between weak and strong Transfer implied in Chomsky 2008). A reviewer also asks about non-complementary distribution, i.e. configurations in which both the pronoun and the anaphor freely alternate and are equally acceptable (or indeed, equally unacceptable, as in the cases of overlapping reference discussed in Reinhart & Reuland 1993). It is important to note in this connection that the present approach takes only obligatoriness, not optionality, to demand an explanation under the SMT and a minimally specified UG (cf. §1; indeed, its main conceptual advantage is that it only seeks to explain what needs to be explained, reducing the core binding facts to principled variation and leaving the rest open to free variation). More specifically, interpretively constrained pronominal/anaphoric forms are predicted to arise only where two Merge options (internal and external) compete at the phase level. Where either Internal or External Merge is unavailable (cf. footnote 2), interpretive optionality and thus non-complementarity should re-emerge, at either or both interfaces. For sem, an example of such non-complementarity has already been discussed (the freely interpreted embedded pronoun in (9c)); the phon equivalent (i.e. multiple realizational options) is no less expected, and may be manifested in the form of pronoun/anaphor interchangeability, as found in certain DP and PP configurations. These tentative suggestions indicate at least some of the empirical and theoretical directions in which the current approach might be immediately extended.

<sup>9</sup> Similarly, the phase delivers the *effect* of the government-and-binding theory (GB) binding domain, since it is at the phase level that these choices apply and are made to count. Clearly, this is not the same as claiming the phase to actually *be* the binding domain (redux) in any primitive sense, in which pronouns must be free and anaphors bound; see e.g. Uriagereka & Gallego 2006; Hicks 2009; Sabel 2012 for other ways to conceive the relation between binding domains and phases.

## **Abbreviations**


## **References**

Abe, Jun. 2014. *A movement theory of anaphora*. Berlin: De Gruyter. DOI: 10.1515/9781614516996.



# **Chapter 22**

# **Beyond one, two, three: Number matters in classifier languages**

Cherry Chit-Yu Lam

The Open University of Hong Kong

Chinese has been widely recognised as a classic example of a numeral-licensing classifier language, in which the presence of a classifier is obligatory for the overt quantification of nouns. This paper presents new data from Mandarin and Hong Kong Cantonese (HKC) showing that the need for classifiers in quantification is not always absolute. Systematic variation emerges once an extended range of numerals (numerals larger than three) and a wider range of nouns (varying in animacy) are examined. The findings reveal a consistent pattern: HKC has a stricter requirement for classifiers in enumeration, since bare common nouns are not definite in HKC and the language lacks the alternative strategies found in Mandarin.

## **1 Introduction**

Chinese, particularly Mandarin, has been an exemplar of a language with numeral-licensing classifiers. This paper presents new data from mainland Mandarin and Hong Kong Cantonese (HKC) which contradict such a neat understanding.

It is generally understood that, in Mandarin and HKC, whenever overt quantification is expressed in a noun phrase, whether by quantifiers like *jǐ* (HKC *gei2*) 'some', or numerals like *sān* (HKC *saam1*) 'three', a classifier must be present, regardless of mass-count distinction (1–2).

(1) Mandarin (Chierchia 1998: 92; Cheng & Sybesma 1999: 519)


Cherry Chit-Yu Lam. 2020. Beyond one, two, three: Number matters in classifier languages. In András Bárány, Theresa Biberauer, Jamie Douglas & Sten Vikner (eds.), *Syntactic architecture and its consequences I: Syntax inside the grammar*, 511–525. Berlin: Language Science Press. DOI: 10.5281/zenodo.3972876

### Cherry Chit-Yu Lam

(2) HKC (Sio 2006: 14; Cheng & Sybesma 2005: 272)
	- a. saam1 \*(bun2) syu1
	  three clf book
	  'three books'
	- b. jat1 \*(bui1) seoi2
	  one cup water
	  'a cup of water'

This paper focuses only on cases of enumerated common count nouns such as (1a) and (2a), since measure words are necessary to license the counting of mass nouns even in non-classifier languages like English. Indeed, measure words such as those in (1b) and (2b) are termed "massifiers" in Cheng & Sybesma (1998), and differ from the (count-)classifiers in the (a) sentences. Massifiers serve to *create* a unit of measure, while count-classifiers, or classifiers for short, merely *name* a unit of counting that is inherent to the entity itself.<sup>1</sup> New data reveal a systematic pattern: the classifier can be optional, sometimes even disfavoured, in a [num + clf + n] structure once the numeral reaches a certain size. Furthermore, HKC has been found to be much less permissive of this exception than Mandarin. This new pattern challenges the traditional view (i.a. Krifka 1995; Chierchia 1998; Cheng & Sybesma 1999; 2005; 2012; Doetjes 1996) that numerals in a classifier language like Chinese obligatorily require licensing by a classifier, and it forms a consistent picture with the general observation that Cantonese requires classifiers for individuation more strictly than Mandarin does.

## **2 Beyond one, two, three: A new perspective**

### **2.1 Theoretical background: Krifka (1995) and Chierchia (1998)**

Krifka (1995) and Chierchia (1998) offer two classic analyses of Chinese-style classifier languages, in which classifiers license enumeration.<sup>2</sup> Krifka suggests that the presence or absence of (the need for) classifiers is determined by whether the numerals of the language have a built-in measure function. In Mandarin, he argues, numerals do not come with such a measure function, hence whether the measuring unit is an "object unit" (OU) – a unit that measures the number of specimens of a kind – or a "kind unit" (KU) – a unit that measures subspecies – is left underspecified (Krifka terms this underspecified unit an "object or kind unit" (OKU)). Assuming that OU or KU can only apply to objects but not kinds, the presence of a classifier not only specifies which measuring unit is in use, but also generates an object-referring interpretation for the entity denoted by the noun. The opposite is true of English: English numerals have this measure function inherently, and hence can express on their own what [num + clf] does in Mandarin. Krifka (1995) uses this distinction in the measure function of numerals to account for typological differences between classifier and non-classifier languages; Bale & Coon (2014), however, found in Mi'gmaq (Algonquian) and Chol (Mayan) that such a distinction can also appear within a single language. In other words, while numerals in different languages can vary in the presence/absence of a measure function – producing non-classifier and classifier languages respectively – different numerals within a language can also vary in the same way. In the latter case, some numerals can go directly with count nouns, but some cannot. In Mi'gmaq, for instance, Bale & Coon report that "numerals 1–5 (along with numerals morphologically built from 1–5) do not appear with classifiers, while numerals 6 and higher must" (Bale & Coon 2014: 700), as illustrated in (3–4).

<sup>1</sup>According to Cheng & Sybesma (1998), massifiers can be used with mass and count nouns, such as *liǎng bēi shuǐ* 'two glasses of water' and *yī qún niǎo* 'a flock of birds' – massifiers with count nouns have also been known as "group classifiers", as pointed out by a reviewer.

<sup>2</sup>This numeral-licensing function of Chinese-style classifiers contrasts with the classifier system in languages like Japanese (Watanabe 2006), Purepecha (Vázquez-Rojas Maldonado 2012), and Niuean (Massam 2009), where numerals are classifier-licensing, i.e. classifiers can only occur when a numeral is present. This can be seen in the cases of [clf + n] in argument positions in both Cantonese and Mandarin, though the two varieties differ in terms of whether such noun phrases can appear as subjects or not (cf. Cheng & Sybesma 1999; Sio 2006).

(3) Mi'gmaq


(4) Mi'gmaq


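Bale & Coon's generalisation amounts to a one-line decision rule. The sketch below is purely illustrative and not part of their analysis; the function name is invented, and numerals morphologically built from 1–5 (which also reject classifiers) are deliberately not modelled:

```python
def needs_classifier(numeral: int) -> bool:
    """Toy rendering of Bale & Coon's (2014: 700) Mi'gmaq generalisation:
    numerals 1-5 combine directly with count nouns, while numerals 6 and
    higher must appear with a classifier. (Numerals morphologically built
    from 1-5, which also reject classifiers, are ignored here.)"""
    return numeral >= 6

# 'three' needs no classifier; 'seven' does.
assert needs_classifier(3) is False
assert needs_classifier(7) is True
```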
On the other hand, Chierchia (1998) explains this difference between Mandarin and English, or rather between classifier and non-classifier languages in general, through the inherent properties of their nominals. He suggests that all common nouns in (Mandarin) Chinese are mass nouns, and that all mass nouns are inherently plural (the "inherent plurality hypothesis"). Chierchia explains that count nouns are

### Cherry Chit-Yu Lam

inherently singular, and become pluralised when used to refer to a set of singularities. Singular count nouns form singleton sets and are the rudimentary building blocks of all other plural sets (Chierchia terms them *atoms*). Plural count nouns denoting a group of singularities are thus conceptualised as unions (∪) of atoms. For Chierchia, mass nouns are plural-like; the difference is that plural count nouns are sets formed by the union of *atoms*, while mass nouns are "the closure under ∪ of *a set of atoms*" (Chierchia 1998: 70). In other words, a mass noun denotes the full union-closed set, and in that way neutralises the difference between plural (i.e. sets) and singular (i.e. atoms). Chierchia therefore suggests that Mandarin common nouns provide a neat exemplar of the four mass noun criteria in (5).

(5)
	- a. There is no plural marking.
	- b. A numeral can combine with a noun only through a classifier.
	- c. There is no definite or indefinite article.
	- d. Nouns can occur bare in argument position.
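Chierchia's set-theoretic picture can be made concrete with a small computation. In the hypothetical sketch below (which is only an illustration of the closure notion, not Chierchia's own formalism), atoms are modelled as singleton frozensets, plural individuals as unions of two or more atoms, and the mass denotation as the closure of the atom set under ∪:

```python
from itertools import combinations

def closure_under_union(atoms):
    """Return the closure of `atoms` (a set of frozensets) under union:
    every union of one or more members of `atoms`."""
    closed = set()
    for r in range(1, len(atoms) + 1):
        for combo in combinations(atoms, r):
            closed.add(frozenset().union(*combo))
    return closed

atoms = {frozenset({'a'}), frozenset({'b'}), frozenset({'c'})}  # singular individuals
mass = closure_under_union(atoms)  # Chierchia-style mass denotation: 7 members
plurals = mass - atoms             # proper plural individuals (sums only): 4 members

# The mass denotation contains both the atoms and all their sums,
# thereby neutralising the singular/plural distinction.
assert atoms <= mass and len(mass) == 7 and len(plurals) == 4
```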

Focussing mainly on the second property, concerning the distribution of numerals and classifiers, the empirical data in §2.2 show that the claim made in (5b) is too strong to hold. Turning back to Krifka's alternative, the proposal that the need for classifiers stems from the absence of a measure function in numerals seems more plausible, especially under the re-interpretation in Bale & Coon (2014). However, the patterns in Mandarin and HKC are not as clear-cut as those in Mi'gmaq and Chol, which may pose a challenge to an analysis purely along the lines of Krifka.

### **2.2 Number size and classifiers**

One key observation about the examples used in the existing literature on Chinese classifiers is that most (if not all) are confined to the numerals one, two, and three. This study has examined numerals beyond three. Table 22.1 lists the numerals tested; all of them are cardinal numbers.

These nineteen numerals, ranging from 1 to 11000, are used with eight common count nouns in Mandarin and HKC to form noun phrases which appear as either subject or object in simple declarative sentences. The eight nouns considered are presented in Table 22.2. They vary in terms of degree of animacy (from human to inanimate) and number of syllables (mono- or disyllabic). (6) and (7) are sample sentences.


Table 22.1: Chinese numerals

Table 22.2: Chinese nouns: Animacy and phonological size



### (6) Mandarin


### (7) HKC

I buy-pfv two-ten-one clf dictionary
'I bought twenty-one dictionaries.'

Regarding the classifier–noun pairings in the study, each of the common nouns under investigation is paired with the only appropriate classifier in the language: in Mandarin *gǒu* 'dog' appears with the classifier *zhī* (e.g. *ten \*gè*/*zhī gǒu*), and in HKC *syu6* 'tree' with *po1* (e.g. *saap6 \*go3*/*po1 syu6*). The only "exception" concerns the [+human] nouns, as there are two possible classifiers for the noun *student*: a general classifier *gè*/*go3* and a specific one *wèi*/*wai2*. For better comparison with the monosyllabic [+human] noun *person*, which cannot occur with the specific classifier *wèi*/*wai2*, the classifier used for both *student* and *person* in this paper is the general classifier *gè*/*go3*.

In the acceptability judgment task, Mandarin and HKC native speakers<sup>3</sup> were asked to judge the acceptability of these sentences with and without classifiers. The judgement results have revealed several interesting patterns. First, both Mandarin and HKC speakers allow the [+human] count noun, *person*, to take the

<sup>3</sup>The results reported in this paper are taken from an acceptability judgment questionnaire from 2014. Four native Mandarin speakers and four native Hong Kong Cantonese speakers, aged 25–30, were consulted. Two of the Mandarin speakers were from Guangdong province, and the other two from northern China near Tianjin; samples of both varieties were gender-balanced. Participants were asked to rate sentences on a four-point scale (0–3). By comparison with control sentences, the following scale of acceptability was established (in terms of average score): 2.8–3.0 = completely acceptable (✓), 1.8–2.7 = marginally acceptable (?), 1.3–1.7 = unacceptable (?/\*), 0.0–1.2 = absolutely unacceptable (\*). This terminology is adopted consistently in this paper. Since little regional variation was found between southern and northern Mandarin speakers, and for convenience of exposition, average judgment scores are presented in the text.


[num+ ∅ + n] structure, regardless of the value of the numeral. More precisely, in Mandarin, all tested numerals higher than 10 are rated acceptable, whether in subject or object position. In HKC, when the noun phrase appears as object, the numeral has to be greater than 30; when the noun phrase appears as subject, only numerals higher than 100 are rated acceptable. All other sentences with [num+ ∅ + *person*] (as subject or object) are considered marginally acceptable (none is completely ill-formed).
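The judgment pattern for [num + ∅ + 'person'] just described can be summarised as a simple lookup. The function below is only a restatement of the reported ratings as a toy model; the function name and argument conventions are invented:

```python
def null_clf_person(variety: str, numeral: int, position: str) -> str:
    """Summary of the reported judgments for [num + null-clf + 'person']:
    Mandarin accepts numerals above 10 in either position; HKC requires
    numerals above 30 as object and above 100 as subject. Everything else
    is only marginally acceptable (never completely ill-formed)."""
    if variety == "Mandarin":
        return "acceptable" if numeral > 10 else "marginal"
    # HKC: threshold depends on the syntactic position of the noun phrase.
    threshold = 100 if position == "subject" else 30
    return "acceptable" if numeral > threshold else "marginal"

assert null_clf_person("Mandarin", 20, "subject") == "acceptable"
assert null_clf_person("HKC", 50, "subject") == "marginal"
assert null_clf_person("HKC", 50, "object") == "acceptable"
```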

Down the scale of animacy, while HKC shows a pattern consistent with the traditional understanding (numerals must be licensed by classifiers), Mandarin speakers allow null-classifier enumeration more liberally, especially with two sets of numerals. The first set involves the high numerals 1000, 10000, and 11000. In Mandarin, subject noun phrases allow these three numerals to occur without the mediation of a classifier whenever the noun is animate (object noun phrases require a human noun).<sup>4</sup> Even with nouns of lower animacy, these three numerals consistently score higher in Mandarin null-classifier noun phrases. More importantly, in Mandarin, the presence of a classifier is dispreferred when the noun *rén* 'person' occurs with these three high numerals: such sentences are considered marginally acceptable (2.5 as subject, 2.0 as object) when the classifier is present, and completely acceptable (3.0) when it is not. HKC noun phrases permit far fewer such exceptions: apart from the noun *jan4* 'person', no noun can be enumerated without the presence of a classifier, however large the numeral is.

One possible explanation for such unmediated quantification could be that the classifier is still present in the structure but phonologically (partially) covert. An anonymous reviewer has pointed out that there is often a glottal stop between the numeral and the noun whenever the classifier is absent, presumably, where the noun is [+human] and hence the potential classifier would be *gè* in Mandarin or *go3* in HKC. In the Jin varieties of northern China, for instance, their equivalent of *gè* has been reported to have a final glottal stop in addition to the one in the onset.<sup>5</sup> If the same unmediated quantification is found in the Jin varieties, then what happens there could be that since there are two glottal stops in the classifier *gè*, one of them remains as the "residue" of the classifier and licenses the numeral in the place of the classifier itself.

However, empirically, the Mandarin and Cantonese speakers consulted in this study have not displayed such an articulatory feature, and even if it is indeed the case, the phonological reduction process could only be acting as an additional

<sup>4</sup> In any case, the noun concerned has to be disyllabic.

<sup>5</sup> I thank a reviewer for introducing me to the observations in the Jin varieties.


trigger for the omission of the classifier when the noun is [+human], but not as a sufficient condition to account for the selective permissiveness of [num+ ∅ + n], which is shown to be sensitive to animacy and number size. Otherwise, it would predict that (i) all [+human] nouns allow [num+ ∅ + n] regardless of number size, and (ii) all nouns that can appear with *gè*/*go3* (such as *apple*, *ball*, and other [−animate] nouns) allow [num+ ∅ + n]; neither is empirically true. In fact, going back to the Mandarin and HKC data, despite the absence of a glottal stop in the coda position of the classifier *gè*/*go3*, there is one in the onset. So if, as the phonological reduction hypothesis goes, a glottal stop between the numeral and the noun can act as a reduced form of the classifier, then the glottal stop in the onset should work as well as one in the coda position. As noted above, however, no such articulatory feature has been observed, and the phonological reduction hypothesis alone would overgeneralise the pattern of classifier-less enumeration in Mandarin and HKC.

Therefore, the classifier system in the Jin varieties certainly deserves further investigation, but based on the Mandarin and HKC data so far, a more plausible explanation for the observed exception is that big numbers like *yì qiān* 'one thousand' and *yí wàn* 'ten thousand', like English *thousands* and *millions*, are not numerals but measure words (Lisa Cheng, p.c.). It is indeed the case that a measure word cannot co-occur with a classifier, as in (8).

(8) a. Mandarin

\* wǔ jīn kē cài
five catty clf vegetable

b. HKC

\* saap6-jat1 doi6 go3 ping4guo2
ten-one bag clf apple

Nevertheless, it is important to note that even though the presence of a classifier may be disfavoured at times, [num+ clf + n] is never an unacceptable structure. In other words, the null-classifier structure is an additional option, never the only available one. I therefore suggest that these high numerals have an inherent measure function emerging in Mandarin (à la Krifka 1995), but have not yet been grammaticalised into proper measure words. When these high numerals occur, the noun can either be individuated by the measure function of the numeral, in which case no classifier is required, or be individuated by the classifier. The preference between the two individuation strategies varies from speaker to speaker.

Another exception happens with the numeral *one*. Mandarin speakers consider direct enumeration marginally acceptable when the count noun is disyllabic and


non-human. More specifically, when the noun phrase is a subject, *one* can combine directly with any non-human count noun (scores range from 2.0 to 2.3); when it is an object, the count noun must denote an animal or a plant (both scored 1.8) and not a completely inanimate object like *dictionary* (scored 1.3). A possible explanation for this pattern is that Mandarin is developing an indefinite article: the slight subject–object asymmetry in the acceptability of [*one* + n] may be a sign that this development is still ongoing. Chierchia (1998) suggests that the indefinite article is simply a variant of the first numeral, and this is a well-established grammaticalisation pathway (Heine & Kuteva 2002). Therefore, what Chierchia predicts for Mandarin – that there is "no morpheme that combines directly with a noun and means what *a* means in English" (Chierchia 1998: 91) – may not be correct, since *one* without the mediation of a classifier can be interpreted as an indefinite article (9).

(9) Mandarin

yì sōngshu sǐ-le
one pine.tree die-pfv
'One/a pine tree died.'

### **2.3 More than numbers**

The data presented in §2.2 boil down to one general conclusion: classifiers can be optional in licensing a numeral, especially in Mandarin, depending on the size of the numeral. This observation raises two issues: (i) numeral size can determine the necessity of classifiers for individuation – *one* and high numerals behave differently; and (ii) HKC requires classifiers for individuation far more strictly than Mandarin does. The first issue was discussed in the previous section, so this section is devoted to the cross-linguistic variation in the use of classifiers.

The difference between Mandarin and HKC in permitting [num+ ∅ + n] structures is consistent with a more general pattern that HKC more strictly requires the presence of classifiers for individuation. Figures 22.1 and 22.2 summarise the Mandarin and HKC classifier paradigms.

On the one hand, §2.2 has shown that HKC only allows null-classifier enumeration with the noun *jan4* 'person', and only when the numeral is greater than 100 (for subjects) or 30 (for objects); on the other hand, Cheng & Sybesma (1999) famously identified that HKC allows [clf + n] as both subject and object, whereas Mandarin only allows it as object. What appear to be two separate issues can be rethought as one if we take another perspective on the second. HKC, in fact,


Figure 22.1: Mandarin classifier paradigm



does not allow bare common nouns in subject position (*mat6fong1* 'bee' in Figure 22.2), except when they act as proper names (*lou5ban2* 'boss' in Figure 22.2). Therefore, instead of viewing the second issue as Mandarin disallowing [clf + n] as subjects, it is more appropriate to see it as HKC requiring a classifier for subject noun phrases with a common count noun. In that case, the two issues are unified under a general cross-linguistic difference: HKC requires the presence of a classifier for individuation more strictly, regardless of the need for enumeration. To account for this requirement in HKC, Cheng & Sybesma (1999) suggest that classifiers express definiteness like the English determiner *the*, hence a classifier phrase (clfp) is projected whenever a definite reading arises. Since they report that both HKC [clf + n]s and Mandarin bare common nouns have a definite reading, the difference between the HKC strategy and the Mandarin one is that the former has an overtly articulated clf<sup>0</sup> while the latter has an empty clf<sup>0</sup>. In contrast, since bare common nouns in HKC are not definite, the classifier phrase which encodes definiteness is not projected in HKC bare common nouns. Therefore, assuming that Chinese requires a definite subject, bare common nouns cannot be subjects in HKC.

The issue of referentiality or definiteness can be a plausible explanation for the [clf + n] and bare noun distinction in HKC and Mandarin, but it does not provide an answer for the difference in numeral-licensing function of classifiers in the two Chinese varieties, since both [num+ clf + n] and [num+ ∅ + n] are indefinite.<sup>6</sup> The answer to this cross-linguistic variation in classifier use can be found in three related phenomena in Mandarin (none attested in HKC): (i) the development of *one* as an indefinite article (see §2.2); (ii) the presence of special forms for *two* and *three* – *liǎ* 'two/two of' and *sā* 'three/three of' (10); (iii) the use of plural marker *-men* for animate nouns/noun phrases (11).

<sup>6</sup>Huang (2015) views this [clf + n] pattern from another perspective: a numeral requirement (more specifically, a *one* requirement). On his interpretation, Cantonese allows bare classifier phrases in both subject and object positions, while Mandarin restricts their occurrence to environments with a governing verb or preposition and generally prohibits them in subject position. This observation is captured in the null numeral 'one' micro-parameter (i).

(i) a. In Mandarin, [one e] is [−strong], triggering Agree with clf.
    b. In Cantonese, [one e] is [+strong], triggering Move of clf.

In short, Huang claims that Cantonese has a [+strong] number head, and Mandarin a [−strong] one. This interpretation of the classifier paradigms is insightful, but it still fails to capture the new data on null-classifier enumeration presented in this paper.


### (10) Mandarin


### (11) Mandarin


All three developments have one property in common: the presence of a classifier becomes either optional or disallowed. The development of *one* as an indefinite article in Mandarin allows the classifier to be optional when *one* appears with non-human (disyllabic) count nouns. The two special forms for *two* and *three* in Mandarin cannot occur with classifiers, because they themselves mean 'two of' and 'three of' respectively; that is, they have inherent measure functions, just like the three high numerals 1000, 10000, and 11000. Finally, the fact that the Mandarin plural *-men* is much more developed than its HKC counterpart (*-dei6*), which can only attach to pronouns, is another piece of evidence that Mandarin enumeration is less dependent on the use of classifiers. However, this only suggests that plural marking and classifiers are competing strategies for the enumeration function, not that they are morpho-phonological competitors, as they occupy different structural positions. In Mandarin and Cantonese, for instance, classifiers are pre-nominal, while plural markers are post-nominal. Borer formalises the difference as follows: "the plural marker is a spell-out of an abstract head feature 〈div〉 [divided] on a moved N-stem, while the classifier is an independent f[unction]-morph occurring in the left-periphery of the N" (2005: 95), as represented in (12a,b) below:<sup>7</sup>

<sup>7</sup>Adapted from Borer (2005: 95), the open value 〈e〉DIV is the classifier head, and 〈div〉 is the plural head feature. The co-superscripts (e.g. *max*) indicate range assignment relations.


Borer's explanation suffices for the complementary distribution of classifiers and plural markers but does not account for differences in the distribution of Cantonese and Mandarin classifiers.

## **3 Implications**

This paper has presented new empirical data from mainland Mandarin and HKC, and a new perspective on the classifier paradigms of the two Chinese varieties, particularly regarding the variation in the distribution of bare classifier phrases in subject position. While previous studies have examined the issue from the angle of definiteness-encoding (Cheng & Sybesma 1999) – bare nouns vs. bare classifier phrases – or of the strength of the numeral head (Huang 2015) – numeral phrases with *one* vs. bare classifier phrases – neither can account for the empirical cases where classifiers are optional in licensing numerals in Chinese (especially Mandarin). This paper therefore opens a new way to rethink the puzzle by showing (i) how numeral size, animacy, and phonological size can determine classifier obligatoriness, and (ii) three related phenomena found exclusively in Mandarin which weaken the need for classifiers in their individuation function – *one* as an indefinite article, special forms for *two* and *three*, and a plural marker for animate count nouns. Together these should offer a more unified picture of the use of classifiers in Mandarin and HKC.

## **Abbreviations**




## **References**




Abe, Jun, 497, 502 Abeillé, Anne, 76, 140, 142, 143, 148– 150 Abels, Klaus, 226, 227 Aboh, Enoch O., 280–288 Acedo-Matellán, Vıctor, ́ 376 Ackema, Peter, 187 Acquaviva, Paolo, 366 Adams, Nikki B., 34 Adger, David, 177–179, 183, 369, 461– 463, 465, 466, 474, 475, 478, 480–482, 487, 491, 493 Ahlers, Timo, 377 Alcorn, Rhona, 221, 225 Alexiadou, Artemis, 318, 321, 322, 366 Allen, Cynthia, 222, 223, 225, 227 Alotaibi, Mansour, 79 Alsina, Alex, 34 Alsulami, Abeer, 85 Ammer, Christine, 425 Anagnostopoulou, Elena, 318, 321– 323, 440 Arad, Maya, 440 Arsenijević, Boban, 16, 20, 95, 362 Asbury, Anna, 135 Baardewyk-Resseguier, Jacqueline van, 235 Bach, Emmon, 335 Bachet, Peter, 306 Bagchi, Tista, 297

Baker, James, 387, 388, 395, 401, 402, 405–408, 410–413, 416, 417 Baker, Mark C., 25, 32, 34, 40, 47, 82, 138, 335, 362, 374, 424, 437, 465 Bale, Alan, 513, 514 Baltin, Mark, 436 Bamba, Kaz, 20 Barr, Dale J., 434 Basilico, David, 350 Bauer, Laurie, 374 Bauke, Leah S., 358, 361, 367 Baunaz, Lena, 101 Baxter, Gareth J., 12 Bayer, Josef, 16, 234 Bazalgette, Tim, 42 Beadle, Richard, 189, 192 Beaudoin-Lietz, Christa, 43 Bejar, Susana, 17 Bell, Susan M., 436 Belletti, Adriana, 235 Bender, Emily M., 62 Benincà, Paola, 242 Benskin, Michael, 181, 209 Bentley, Delia, 387 Benveniste, Émile, 184 Bergstrom, Carl T., 5 Berman, Ruth A., 423 Berndt, Rolf, 176, 185, 186, 210 Berwick, Robert C., 296, 311, 362, 498 Beukema, Frits, 146

Biberauer, Theresa, vi, 25–28, 38, 39, 42, 47, 51, 52, 131, 139, 159, 168–171, 233, 234, 243, 245, 278, 280, 284, 285, 296, 351, 361, 366, 368, 369, 375, 497– 499 Blake, Barry J., 306 Bläsing, Uwe, 305 Bloom, Lois M., 17 Blythe, Richard A., 12 Bobaljik, Jonathan D., 46, 47, 179 Bobrow, Samuel A., 436 Boeckx, Cedric, 63, 366, 498 Bond, Oliver, 73 Booij, Geert, 374 Borer, Hagit, vi, 25, 297, 330, 332, 334–336, 338, 343, 348, 351, 361, 367, 368, 398, 399, 438, 478, 522 Börjars, Kersti, 176, 178 Borsley, Robert D., 65, 70, 71, 76–79, 81, 137, 148, 149 Bošković, Željko, 297, 367 Bostoen, Koen, 43 Boyd, Robert, 4 Boye, Kasper, 137, 147 Braune, Wilhelm, 187 Bresnan, Joan, 34, 38, 223, 436 Bril, Isabelle, 306 Britain, David, 177 Brown, Roger, 17 Bruyn, Adrienne, 287 Buchstaller, Isabelle, 177, 185 Buell, Leston C., 30, 32, 34, 38 Burge, Tyler, 296 Burrow, J.A., 189, 190, 198 Burzio, Luigi, 385, 394, 418 Butt, Miriam, 137

Byarushengo, Ernest Rugwa, 257, 258 Bárány, András, 47, 50 Börjars, Kersti, 135, 137–139, 142, 152 Campbell, Alistair, 187 Campbell, Eric William, 254 Campbell, Lyle, 135 Cardinaletti, Anna, 209, 234, 235 Carlson, Greg, 297, 482 Carstens, Vicki, 28, 32, 38, 40 Castañeda, Hector Neri, 309 Castilla-Earls, Anny, 17 Cavalli-Sforza, L. L., 4 Cawley, A.C., 189 Chan, Brian Hok-Shing, 278 Chao, Ke, 374 Chapman, Carol, 176, 178 Cheng, Lisa Lai-Shen, 511, 512, 519, 521, 523 Chierchia, Gennaro, 424, 438, 511– 514, 519 Childs, Claire, 177 Chomsky, Noam, vi, 20, 25, 26, 40, 64, 68, 82, 99, 113, 204, 223, 279, 296, 297, 311, 337, 351, 358, 362, 363, 366, 368, 371, 375, 437, 468, 497–499, 501, 503, 504, 506 Christensen, Ken Ramshøj, 146 Christidis, Anastasios Ph., 99, 100 Chung, Sandra, 462 Cinque, Guglielmo, 106, 107, 135, 234–236, 244, 287 Citko, Barbara, 376 Clark, Eve V., 374 Clark, Robin, 3, 12 Clements, George N., 256 Cocchi, Gloria, 42

Cognola, Federica, 233, 234, 236– 239, 242, 244–246 Cole, Marcelle, 181, 182, 185, 186, 210 Collins, Chris, 40, 362, 365, 369, 424, 437 Coniglio, Marco, 234, 235, 239 Consoli, Joseph P., 239, 247 Coon, Jessica, 513, 514 Cordin, Patrizia., 242 Corrigan, Karen, 176 Cowling, George H., 177 Crain, Stephen, 80 Cram, Daviod, 462, 466, 467 Creissels, Denis, 39 Croft, William, 12 Crow, James F., 4, 12 Crowley, Terry, 305 Cruz, Emiliana, 254 Cuervo, Ana Maria, 366 Culicover, Peter W., 16, 61, 64, 77, 82 Cutler, Ann, 436 Cysouw, Michael, 294 da Cruz, Maxime, 284, 285 Danielsen, Swintha, 303 Davidson, Clifford, 192, 193 Davidson, Donald D., 95, 294, 296, 309, 336 de Augusta, Félix José, 305 de Belder, Marijke, 358, 361, 365–367 de Haas, Nynke, 176, 178–183, 185, 188, 189, 207, 209, 211 de Kind, Jasper, 43 DeGraff, Michel, 287 Dehé, Nicole, 357 DeLancey, Scott, 389 Demuth, Katherine, 39 den Dikken, Marcel, 76, 146, 178, 470 Dench, Alan Charles, 306

Detges, Ulrich, 235, 236 Diercks, Michael, 28, 43 Dixon, R.M.W., 300, 305 Doetjes, Jenny, 512 Donohue, Cathryn, 81 Downing, Laura J., 274 Dowty, David R., 297, 331, 476 Dryer, Matthew S., 278, 279 Dubinsky, Stanley, 422 Dugatkin, Lee Alan, 5 Dunn, Michael John, 306 É. Kiss, Katalin, 357 Eades, Diana, 305 Embick, David, 317, 321–323, 361, 372 Emonds, Joseph, 160, 221, 227 Engel, Pascal, 296 Epstein, Samuel David, 363, 364, 501 Erbach, Gregor, 436 Evans, Nicholas, 16, 138 Everaert, Martin, 436 Faarlund, Jan Terje, 221, 227 Falk, Yehuda, 145 Fanselow, Gisbert, 297 Fedden, Sebastian, 303 Feder, Alison F., 12 Feldman, M. W., 4 Feng, Shengli, 358, 361 Fernández-Cuesta, J., 181, 192 Filip, Hana, 330, 332, 334, 335, 338, 349 Filppula, Markku, 209 Fischer, Olga, 209, 223 Flickinger, Dan, 139, 142 Fodor, Janet Dean, 63, 80, 83 Fowlie, Meaghan, 370 Fox, Danny, 498, 501 Franco, Ludovico, 101, 103

François, Jacques, 135 Friedmann, Naama, 19 Fromkin, Victoria, 374 Fruehwald, Josef, 323, 325, 326 Fuchs, Zuzanna, 38 Fukui, Naoki, 25 Fuß, Eric, 205, 207, 212 Gallego, Ángel J., 365, 505 Gamble, Geoffrey, 303 Gao, Meijia, 339 Gazdar, Gerald, 68 Gehrke, Berit, 440 Gianollo, Chiara, 26 Gibbs, Raymond. W., 436 Gibson, Hannah, 30, 31 Gil, David, 82 Ginzburg, Jonathan, 65–67, 72, 77, 83 Githinji, Peter, 34 Giusti, Giuliana, 42, 145 Givón, Talmy, 46, 93 Givón, Thomas, 16, 19 Godfrey, Elizabeth, 176 Grimshaw, Jane, 223, 279, 280 Groves, Tereb'ata R., 306 Gruzdeva, Ekaterina, 301 Guérin, Valérie, 303 Guerreiro, Yandira, 302 Guillaume, Antoine, 305 Guthrie, Malcolm, 29 Hack, Franziska M., 234 Haddican, William, 35 Haeberli, Eric, 160–164, 168, 169, 209 Haegeman, Liliane, 234 Hagège, Claude, 294, 295, 300, 302,

306, 308 Haiden, Martin, 357 Halbrook, Hal, 428

Halle, Morris, 184, 203, 212, 344, 358 Halpert, Claire, 28 Hamp, Eric, 209 Han, Chung-hye, 160 Harley, Heidi, 184, 205, 358, 361, 365, 366, 436, 440 Harris, Alice C., 387, 394 Haspelmath, Martin, 62, 74, 134, 142, 145, 146, 439 Haug, Dag Trygve Truslew, 371 Hauser, Marc D., 16, 279, 358 Hawkins, John A., 279, 282 Heacook, Paul, 425 Hegedűs, Veronika, 357 Heinat, Fredrik, 502 Heine, Bernd, 91, 92, 134, 135, 519 Henderson, Brent, 40 Henry, Alison, 178, 182, 206 Herburger, Elena, 297 Hernanz, Maria-Lluïsa, 235, 236 Hicks, Glyn, 505 Higginbotham, James, 297, 492 Hill, Virginia, 234 Hinterhölzl, Roland, 234 Hinzen, Wolfram, 16, 20 Holmberg, Anders, vi, 26, 35, 41, 42, 48, 63, 131, 234, 243, 335, 350, 351, 498, 500 Holmqvist, Erik, 185 Hopper, Paul J., 92 Hornstein, Norbert, 297, 363, 497, 502, 504, 505 Horvath, Julia, 422, 424, 426, 429, 436–438, 440 Hsieh, Feng-fan, 278 Hu, Wei, 358 Hu, Xuhui, 330, 339, 343, 350, 351 Hu, Zengyi, 374

Huang, C.-T. James, 342, 345, 350, 351, 359, 361, 521, 523 Huang, Lilian M., 302 Huber, Juliette, 302 Hudson, Richard, 178 Hugjiltu, Wu, 305 Hulk, Aafke, 209 Hyman, Larry M., 44, 45, 253–257, 259–263, 274 Idiatov, Dmitry, 294–296 Ihalainen, Ossi, 176 Ihsane, Tabea, 160–164, 168, 169 Iorio, David Edy, 40, 46 Irurtzun, Aritz, 297 Jackendoff, Ray, 16, 18, 61, 64, 77, 82 Jaeggli, Osvaldo A., 424 Janda, Richard D., 92 Janhunen, Juha, 305 Jenks, Peter, 45 Jerro, Kyle, 34 Jespersen, Otto, 187 Johannessen, Janne Bondi, 18 Johnson, Kyle, 370 Johnston, Paul A., 189, 190 Jones, Barbara Josephine, 305 Joseph, Brian D, 92, 135 Julien, Marit, 32 Kageyama, Taro, 360 Kallulli, Dalina, 103 Kandybowicz, Jason, 498 Katamba, Francis X., 253–257, 259, 260 Kathol, Andreas, 80 Kauhanen, Henri, 12 Kayne, Richard S., 26, 29, 80, 81, 94– 96, 106, 113, 137, 148, 149, 279, 357, 467, 497, 502

Keenan, Edward L., 116 Kehayov, Petar, 137 Kimbrough, Steven O., 12 Kimura, Motoo, 4, 11, 12 King, Pamela, 189 King, Tracy Holloway, 137 Kinyalolo, Kasangati K.W., 32 Kiparsky, Paul, 94, 203, 399 Klemola, Juhani, 176, 209 Koenig, Jean-Pierre, 82 Koontz-Garboden, Andrew, 438 Koopman, Hilda, 436 Koschmann, Timothy, 16 Koster, Jan, 145 Kratzer, Angelika, vi, 318, 319, 321– 323, 465, 477, 481 Krifka, Manfred, 332, 512, 513, 518 Kroch, Anthony, 159, 160, 162, 188, 209, 210 Kroeger, Paul, 138 Kula, Nancy C., 36 Kuryłowicz, Jerzy, 134 Kuteva, Tania, 92, 134, 519 Kuznetsova, Alexandra, 434 Langdon, Margaret, 301 Larson, Richard K., 309, 310 Lasnik, Howard, 223, 297 Law, Paul, 116 Ledgeway, Adam, 28, 33, 52, 242 Lee, Seunghun J., 253 Lefebvre, Claire, 288 Legate, Julie-Anne, 96 Levin, Beth, 385, 392, 398, 403, 406, 408, 413, 431, 438 Levinson, Stephen C., 16, 138 Li, Boya, 234, 278 Libben, Gary, 374 Lidz, Jeffrey, 505

Lieberman, Erez, 9 Liebesman, David, 296 Lightfoot, David W., 92, 137, 159, 160, 164, 170 Lin, Dong-yi, 307 Lin, Jo-Wang, 339 Lindstrom, Lamont, 301 Link, Godehard, 486 Livitz, Inna, 506 Lohndal, Terje, 297, 298, 361 Lonzi, Lidia, 235 Los, Bettelou, 226 Lowles, Alex, 19 Luo, Tianhua, 302, 303 Lusekelo, Amani, 50 Lynch, John, 301 MacDonald, Jonathan E., 330, 333– 335 MacKrell, Thomas, 428 Maho, J. F., 29 Maling, Joan, 223, 227 Manzini, M. Rita, 25, 95, 103, 105, 107, 234 Manzini, M.Rita, 297 Marantz, Alec, 47, 107, 184, 203, 344, 358, 361, 365, 366, 370, 372, 422, 438, 440 Marco, Cristina, 440 Marten, Lutz, 36, 38, 39, 48, 49 Massam, Diane, 17, 512 Matsuomoto, Yokyo, 226 May, Robert, 145 McCloskey, James, 462 McDaniel, Dana, 69 McElreath, Richard, 4 McGinn, Colin, 296 Mchombo, Sam, 34 McIntosh, Angus, 176, 186

McIntosh, Justin Daniel, 254 McIntyre, Andrew, 377, 440 McWhorter, John H., 16 Meeussen, A. E., 38 Meillet, Antoine, 91, 134 Meltzer-Asscher, Aya, 424, 437, 438, 440 Michelson, Karin, 82 Miller, George A., 20 Mittwoch, Anita, 319 Mletshe, Loyiso, 28 Mmusi, Sheila, 39 Montgomery, Michael, 176–178, 185 Montrul, Silvina, 411 Moran, P. A. P., 5 Moravcsik, Edith A., 46 Moro, Andrea, 16, 468, 470 Moshi, Lioba, 34 Mous, Maarten, 30 Mtenje, Al, 274 Müller, Stefan, 62, 65, 71, 74, 80, 83, 84 Munaro, Nicola, 151, 152, 234 Munn, Alan Boag, 18 Murphy, Gregory L., 12 Murray, James, 176 Myers, Scott, 32 Myler, Neil, 323, 325, 326 Müller, Gereon, 208 Narrog, Heiko, 91 Nedjalkov, Igor, 305, 374 Nedjalkov, Vladimir P., 301 Neeleman, Ad, 187 Nevins, Andrew, 16, 187 Newberry, Mitchell G., 9, 12 Newman, Stanley, 274 Newmeyer, Frederick J., 16, 62–64, 71, 92, 135

Ngoboka, Jean Paul, 34, 43 Ngonyani, Deo, 34 Nikitina, Tatiana, 371 Nikolaeva, Irina, 303 Nishiyama, Kunio, 358 Norde, Muriel, 242 Nordström, Jackie, 147 Noyer, Rolf, 184, 436 Nugteren, Hans, 305 Nunberg, Geoffrey, 437, 445 Nunes, Jairo, 370 Nye, Rachel, 97 O'Grady, William, 437 Obenauer, Hans-Georg, 234 Ogawa, Yoshiki, 358 Onishi, Masayuki, 306 Oseki, Yohei, 363, 364 Osumi, Midori, 305 Otaina, Galina A., 301 Padovan, Andrea, 235 Pagel, Mark, 9 Panagiotidis, Phoevos, 368, 374 Pancheva, Roumyana, 325 Parsons, Terence, 297, 317–319, 336 Partee, Barbara H., 296, 349, 492 Paul, Ileana, 116 Pearson, Matthew, 116, 124 Penello, Nicoletta, 235 Peng, Quiong, 339 Pérez-Leroux, Ana T., 17–19, 21 Perlmutter, David M., 385, 387, 392– 394, 396, 397, 407, 413, 415, 418 Pesetsky, David, 336, 501 Peterson, John, 306 Peterson, Tyler, 17–19 Pfau, Roland, 282

Pietroski, Paul, 297, 298 Pietsch, Lukas, 176–178, 180, 182, 185, 186, 206, 209, 210, 212 Pintzuk, Susan, 209 Plag, Ingo, 374 Plank, Frans, 136 Platzack, Christer, 146 Poletto, Cecilia, 151 Pollard, Carl, 62, 70, 140, 142 Pollock, Jean-Yves, 159, 160 Postal, Paul M., 65, 394 Potsdam, Eric, 116 Press, Margaret L., 303 Preston, Laurel B., 16 Pullum, Geoffrey K., 65, 145 Pylkkänen, Liina, 36, 376, 438 Quirk, Randolf, 187 Ralalaoherivony, Baholisoa Simone, 124 Ramadhani, Deograsia, 48, 49 Ramchand, Gillian, vi, 297, 298, 365, 376, 385, 387, 390, 403, 410, 438, 462, 463, 474–477, 481, 487, 491, 493 Ranaivoson, Jeannot Fils, 124 Rappaport Hovav, Malka, 385, 392, 398, 403, 406, 408, 413, 438 Rappaport, Malka, 431, 438 Reape, Mike, 80 Refsing, Kirsten, 374 Reiffenstein, Ingo, 187 Reinhart, Tanya, 337, 424, 438, 439, 498, 505, 506 Reuland, Eric, 505, 506 Richards, Marc, 497–501 Richards, Norvin, 367 Richerson, Peter J., 4

Riedel, Kristina, 34, 39 Ritter, Elizabeth, 184, 205 Rizzi, Luigi, 69, 113, 135, 151 Roberge, Yves, 20 Roberts, Ian, vi, 3, 25–28, 33, 38–42, 47, 48, 51, 52, 61, 63, 92–95, 104, 106, 107, 131, 135, 142, 159–161, 164, 168–171, 175, 181, 182, 185, 202, 204–206, 209, 212, 233, 234, 243, 245, 296, 311, 330, 335, 350, 351, 375, 466, 468, 497, 498 Rodeffer, John, 180, 188, 210 Roeper, Thomas, 16, 21, 367 Rooryck, Johan, 95, 371 Rosen, Carol, 397, 400, 411 Rosenbaum, Peter S., 96 Rothstein, Susan, 330, 331, 334, 335, 340, 349 Roussou, Anna, vi, 92–95, 99, 100, 103, 104, 135, 169, 170, 297, 468 Roy, Isabelle, 144, 151, 465, 471, 485 Rubin, Edward J., 362–364 Rugemalira, Josephat M., 34 Ruwet, Nicolas, 422, 437 Sabel, Joachim, 505 Sadock, Jerrold M., 300 Sag, Ivan A., 62, 64–67, 70, 72, 77, 81, 83, 140, 142 Saint-Dizier, Patrick, 135 Saito, Mamoru, 41 Samioti, Yota, 321, 323 Savoia, Leonardo M., 95, 103, 105, 107 Schadeberg, Thilo C., 34 Scheck, Raffael, 428 Schein, Barry, 297 Schendl, Herbert, 176

Schifano, Norma, 33, 233, 234, 236–239, 242, 244–246 Schneider-Zioga, Patricia, 40 Scholz, Barbara C., 65 Schoorlemmer, Maaike, 398 Schreiner, Sylvia, 462, 465, 467, 472, 485 Segal, Gabriel M., 309, 310 Selkirk, Elisabeth, 253, 267, 273 Sheehan, Michelle, 26, 43, 48, 50, 234, 243, 277, 282 Shibatani, Masayoshi, 374 Shipley, William F., 301 Shomura, Yoko, 411 Sigmund, Karl, 7 Siloni, Tal, 422–424, 426, 429, 431, 436–440 Simango, Silvester R., 34, 422 Sio, Joanna U.-S., 362, 512 Skribnik, Elena, 305 Smeets, Ineke, 305 Smith, Carlota S., 330, 339, 340, 345 Smith, Jennifer, 177–179, 183 Smith, Lucy T., 189, 190 Snyder, William, 21 Soh, Hooi Ling, 339, 340 Song, Chenchen, 358, 359, 361, 366, 367, 377 Sorace, Antonella, 385, 388, 407, 411, 412 Spears, Richard A., 425 Speas, Margaret, 16 Spencer, Andrew, 306 Sportiche, Dominique, 436 Sseikiryango, Jackson, 40 Stadler, Kevin, 12 Starke, Michal, 234 Sullivant, John Ryan, 254 Svantesson, Jan-Olof, 305

Svenonius, Peter, 144, 151, 369 Sweet, Henry, 180, 194 Swinney, David A., 436 Sybesma, Rint, 278, 511, 512, 519, 521, 523 Tagliamonte, Sali, 176 Tai, James H. Y., 345 Takahashi, Daiko, 297 Taraldsen, Tarald, 357 Taylor, Ann, 162, 188, 209, 210, 221, 224 Teng, Stacy Fang-Ching, 302 Tenny, Carol L., 399 Terrill, Angela, 302 Thwala, Nhlanhla, 34 Toivonen, Ida, 139, 141 Tolskaya, Maria, 303 Torrego, Esther, 336 Tortora, Christina, 178 Traugott, Elizabeth Closs, 92, 106, 135 Travis, Lisa, 330, 333–335 Trips, Carola, 205, 207, 210 Trotzke, Andreas, 16 Trousdale, Graeme, 106, 135 Tseng, Jesse, 142, 143 Tsujimura, Natsuko, 359 Turville-Petre, Thorlac, 189, 190, 198 Uriagereka, Juan, 370, 498, 505 van Craenenbroeck, Jeroen, 388 van de Velde, Mark, 42, 259 van den Berg, Margot C., 284–286, 288 van den Berg, René, 306 van der Auwera, Johan, 294–296 van der Wal, Jenneke, 28, 32–38, 40, 43, 44

Van Eynde, Frank, 143 van Gelderen, Elly, 92, 94, 97–99, 105, 135, 146, 187 van Kemenade, Ans, 178–183, 188, 189, 207, 209, 211, 221–227 van Koppen, Marjo, 358, 365, 367 Van Riemsdijk, Henk C., 223 Van Valin, Robert D. Jr., 400, 403 Vanden Wyngaerd, Guido, 371 Varlokosta, Spyridoula, 99 Vat, Jan, 223, 225, 226 Vázquez-Rojas Maldonado, Violeta, 512 Vendler, Zeno, 330, 331 Vikner, Sten, 374, 376, 377 Villard, Stéphanie, 254 Vincent, Nigel, 135–138, 142, 144, 152 Vinet, Marie-Thérèse, 235 von der Gabelentz, Georg, 234 Walkden, George, 227, 279, 282 Waltereit, Richard, 235, 236 Warner, Anthony, 159 Wasow, Thomas, 65, 426, 429 Watanabe, Akira, 512 Watters, John R., 259 Watumull, Jeffrey, 15 Weber, David John, 303 Weir, Andrew, 234 Wexler, Kenneth, 25 Weydt, Harald, 235, 242 White, James Gordon, 425 Wiltschko, Martina, 138 Wrenn, Charles L., 187 Yoder, Brendon, 82 Zaenen, Annie, 393, 398, 400 Zamparelli, Roberto, 492

Zanuttini, Raffaella, 234 Zeijlstra, Hedde, 361 Zeller, Jochen, 34, 43 Zhang, Niina Ning, 358, 361, 367 Zimmermann, Malte, 234, 239

# **Language index**

Ainu, 374 Albanian, 134 Amis, 307, 308 Arabic, 78, 79, 84, 85 Archi, 73, 78 Atayal, 302 Bangla, 234 Bantu, 25, 28–31, 31<sup>5</sup> , 33–35, 38, 39, 42, 42<sup>12</sup> , 43, 45–47, 51, 52, 253, 254, 259, 265, 273 Basaa, 44–48 Basque, 137, 295<sup>2</sup> , 309, 388, 407 Baure, 303<sup>11</sup> Belfast English, 178, 179<sup>5</sup> , 206 Bella Coola, 274 Bembe, 46, 47, 51 Bonan, 305 Brythonic Celtic, 181, 209<sup>42</sup> Bulgarian, 324, 325 Buryat, 305 Cantonese, 517 Cavineña, 305 Celtic, 462 Chemehuevi, 303 Chichewa, 38, 273, 274, 422 Chimwiini, 31<sup>5</sup> Chol, 513, 514 Chongqing Mandarin, 303 Chukchi, 306 Ciluba, 42, 43, 45, 47, 48

Croatian, 134 Czech, 338, 348 Dagur, 305<sup>12</sup> Danish, 136, 137, 144<sup>6</sup> , 145, 147 Dutch, 76, 187<sup>15</sup> , 223, 224, 226, 283, 365<sup>11</sup> , 393–395, 400, 407, 411 Dyirbal, 300<sup>8</sup> , 305, 306 Early Modern English, 159, 160, 163, 165–167, 170 English, 9, 9<sup>3</sup> , 19–21, 65, 69–71, 74, 75, 78, 80–85, 94, 96, 97, 99, 101, 105, 106, 117, 119, 122, 133, 134, 136, 137, 139, 143–145, 159, 160, 161<sup>1</sup> , 162, 162<sup>3</sup> , 163<sup>4</sup> , 164, 168, 169, 169<sup>8</sup> , 170, 171, 283, 284, 286, 287, 299, 310, 318, 322, 324, 325, 326<sup>6</sup> , 329–334, 336–338, 341, 343, 345, 347, 357–361, 373–376, 385, 390, 398, 401, 402, 407, 408, 413, 421–427, 431, 435, 438, 440, 440<sup>13</sup> , 441 Erromangan, 305 Evenki, 305, 374 Faroese, 147 Finnish, 80, 399 Fongbe, 280, 284, 285 French, 20, 33, 76, 76<sup>15</sup> , 95, 136, 137, 143, 144, 144<sup>6</sup> , 148, 150, 281<sup>1</sup> , 283, 286–288, 411, 422, 470

Georgian, 387, 388, 407 German, 19, 21, 69, 73, 136, 137, 144–146, 148, 234, 242, 357, 376, 377, 411 Germanic, 133, 134, 144–146, 148, 285, 289, 329, 334, 335, 344, 350, 357, 374, 377 Gokana, 274 Greek, 92<sup>1</sup> , 93, 95, 96, 99, 99<sup>5</sup> , 100–104, 134, 137 Gumbaynggir, 305<sup>13</sup> Gungbe, 281–283, 285, 288 Haitian Creole, 283, 287 Haya, 257, 258, 273, 274 Hebrew, 422, 422<sup>1</sup> , 423–427, 431, 435, 437, 438, 440, 440<sup>13</sup> , 451–453 Herero, *see* Otjiherero Hong Kong Cantonese, 511, 512, 514–516, 516<sup>3</sup> , 517, 519, 521–523 Huallaga Quechua, 303 Icelandic, 227, 500 Irish, 137, 461 Italian, 136, 137, 233–235, 235<sup>2</sup> , 235<sup>3</sup> , 236, 237, 239, 239<sup>7</sup> , 242, 246, 247, 397, 414 Italo-Romance, 238, 239, 239<sup>7</sup> , 242, 242<sup>10</sup> , 245, 246 Japanese, 20, 357, 359, 359<sup>4</sup> , 360, 512<sup>2</sup> Kalaallisut, 300, 300<sup>9</sup> , 301 Kalmuck, 305 Kavalan, 307 Khalkha Mongolian, 305 Kharia, 306 Kimeru, 36 Kinande, 40<sup>10</sup>

Kinyarwanda, 43, 47 Kiribati, 306 Korean, 137 Kwamera, 301 Kîîtharaka, 36 Ladin, 234 Latin, 103, 134, 136, 137, 144, 150 Lavukaleve, 302 Lubukusu, 36 Luganda, 40, 253–255, 255<sup>3</sup> , 257, 258, 262–268, 268<sup>10</sup> , 269–271, 273, 274 Luguru, 36, 37, 48, 49 Lusoga, 253, 254, 262–265, 267, 273, 274 Maidu, 301 Makalero, 302 Makhuwa, 28, 32, 33, 44, 44<sup>14</sup> , 46–48, 51 Malagasy, 113–115, 115<sup>4</sup> , 116, 117, 119, 121, 123, 124, 127 Mandarin, 339, 340<sup>7</sup> , 341, 342<sup>10</sup> , 342<sup>9</sup> , 345<sup>14</sup> , 511–516, 516<sup>3</sup> , 517–519, 521, 521<sup>6</sup> , 522, 523 Mangghuer, 305<sup>12</sup> Mapudungun, 305 Martuthunira, 306 Matengo, 28 Mavea, 303 Mi'gmaq, 513, 514 Mian, 303 Middle English, 161, 162, 170, 181, 181<sup>8</sup> , 185, 189, 207, 209, 210<sup>44</sup> , 213 Modern Standard Arabic, *see* Arabic Moghol, 305<sup>12</sup> Mongghul, 305<sup>12</sup>

Motuna, 306 Niuean, 512<sup>2</sup> Nivkh, 301 Nyakyusa, 48, 50 Nêlêmwa, 306, 306<sup>15</sup> , 307 Oirat, 305<sup>12</sup> Old English, 133, 134, 163, 163<sup>4</sup> , 175, 221–225, 227, 228 Oneida, 82 Ordos, 305<sup>12</sup> Otjiherero, 36, 37 Persian, 80 Pitta-Pitta, 306 Polish, 74 Purepecha, 512<sup>2</sup> Puyuma, 302 Rangi, 30, 31 Riau Indonesian, 82 Romance, 33, 47, 94, 133, 134, 137, 144, 148, 149<sup>7</sup> , 150, 239<sup>7</sup> , 287, 357, 374–376, 467 Romanian, 134 Russian, 333, 348, 431, 506 Santa, 305<sup>12</sup> Scottish Gaelic, 462–464, 466–472, 472<sup>3</sup> , 473–478, 480, 482–491, 494 Shira Yughur, 305<sup>12</sup> Shona, 36 Slavic, 329, 330, 333–335, 343–345, 348, 349, 351, 374 Southern Sotho, 36, 37 Spanish, 76, 76<sup>15</sup> , 136, 235<sup>2</sup> , 236, 375, 377, 431 Swahili, 29, 36, 134 Swedish, 136, 137, 139, 141, 143–148, 150 Tagalog, 113–117, 119, 119<sup>6</sup> , 120–125, 127–129 Tianjin Mandarin, 302, 303 Tibetan, 389 Tinrin, 305 Tunen, 30, 31 Turkish, 137, 393 Udihe, 303, 304 Vitu, 306 Wangkajunga, 305 Warlpiri, 81 Wayuu, 302 Welsh, 65<sup>5</sup> , 74, 80, 81, 180, 181, 466 Wikchamni, 303 Wu Chinese, 339<sup>5</sup> Yixing Chinese, 329, 330, 339–343, 345, 346, 351 Yongxin Gan, 303 Zulu, 30, 31, 34, 36, 38

# **Subject index**

absolutive case, 115<sup>2</sup> , 120 accusative case, 84, 85, 105, 106, 136, 397 adjunction, 16, 18, 139, 161, 357, 359, 361–364, 366, 488 adverbs, 126, 130, 159–162, 162<sup>2</sup> , 163–167, 170, 176, 180, 211, 225, 234, 235, 319, 320 Agree, 40, 65, 67, 168, 180, 204, 207, 358, 371, 372, 521<sup>6</sup> agreement, 33, 39, 40, 42, 42<sup>11</sup> , 43, 44, 47, 79, 83, 175–178, 178<sup>3</sup> , 179–183, 183<sup>9</sup> , 184–191, 193, 198, 200, 202–204, 204<sup>35</sup> , 205, 205<sup>36</sup> , 206, 206<sup>39</sup> , 207–209, 209<sup>41</sup> , 210, 210<sup>43</sup> , 212, 213, 467, 469, 470, 473, 485, 486, *see also* object marking complementizer agreement, 209 number agreement, 78, 202, 207, 212, 486 object agreement, 38, 46 Spec–head agreement, 336 subject agreement, 38, 46, 65, 78, 79, 204, 469 agreement weakening, 182, 187, 209, 210<sup>43</sup> , 211 alignment, 37 animacy, 48, 101, 225, 514, 517, 518, 523

auxiliaries, 5, 30<sup>3</sup> , 42, 69, 71, 77, 134, 145, 162, 163<sup>3</sup> , 164–166, 168–171, 179<sup>5</sup> , 195, 195<sup>27</sup> , 196, 202, 285, 289, 320–322, 324, 394, 397, 411, 412, 414, 424, 466

binding, 497, 499, 500, 502, 503, 505, 505<sup>8</sup> , 505<sup>9</sup> , 506 Borer–Chomsky conjecture, 25, 74, 350, 362

case
abstract Case, 28, 28<sup>1</sup> , 43<sup>13</sup> , 47 Case assignment, 47 case features, 74 Case licensing, 28<sup>1</sup> , 35, 37, 43 morphological case, 73, 178 categorial distinctness, 367 classifiers, 511, 512, 512<sup>2</sup> , 513, 514, 516–521, 521<sup>6</sup> , 522, 522<sup>7</sup> , 523 clefts, 30<sup>3</sup> , 114, 116, 117, 119, 122, 125, 129, 465, 468, 471, 472<sup>3</sup> , 473, 474, 486–488, 491, 492 clitics, 149<sup>8</sup> , 182, 205, 206, 211, 213, 224, 226, 227, 268–272, 273<sup>12</sup> complementizers, 29, 30, 40, 42, 44, 65, 79, 92–94, 94<sup>2</sup> , 95–108, 113, 118, 119, 122, 124, 133, 143, 147, 148, 149<sup>7</sup> , 151, 152, 222, 225, 227–229, 280, 289 complexity, 15, 16, 16<sup>1</sup> , 17–22 compounding, 260, 357, 358, 361, 367, 372, 374, 377 control, 149, 324, 516<sup>3</sup> coordination, 17, 18, 176 copulas, 114, 121, 127, 128, 163<sup>3</sup> , 164, 165, 168, 171, 463<sup>1</sup> , 474, 486–489, 491 dative case, 144, 228, 431, 431<sup>7</sup> defective goal, 40 demonstratives, 45, 94–96, 119, 147, 228, 266, 267 Distributed Morphology, 344<sup>13</sup> , 440 ditransitive constructions, 34, 36, 306, 417 DOC, *see* double object construction double object construction, 29 E-language, 113, 114 ellipsis, 322 elsewhere condition, 176, 183, 184, 187, 203, 204, 205<sup>36</sup> , 206, 502 enumeration, 517–519, 521<sup>6</sup> , 522 EPP, *see* extended projection principle ergative alignment, 50<sup>15</sup> expletives, 118, 128, 129, 474 extended projection principle, 468 final-over-final condition, 277–284, 286, 289 focus, 30<sup>3</sup> , 74, 103, 113–116, 119–129, 234<sup>2</sup> , 259, 260, 262 contrastive focus, 30 FOFC, *see* final-over-final condition

fronting of arguments, 343<sup>12</sup> of pronouns, 224 functional items, 25, 26, 32, 35–37, 40, 42, 92, 120, 125, 133–135, 138, 144, 149–151, 168, 204, 234, 243, 329, 330, 335<sup>1</sup> , 336, 337, 344, 344<sup>13</sup> , 348–350, 361, 362, 364, 367, 369, 369<sup>21</sup> , 385, 403, 424 genitive case, 17, 18, 21, 85, 136, 268 grammaticalization, 91, 92, 92<sup>1</sup> , 93–95, 97, 103–109, 133–135, 149, 152, 195, 229, 360, 361, 518, 519 head movement, 29, 32–34, 40, 375 Head-Driven Phrase Structure Grammar, 62, 64, 64<sup>4</sup> , 65, 66, 66<sup>6</sup> , 67, 68, 70–75, 78–80, 82, 83, 85, 134, 137<sup>1</sup> , 140, 143, 144, 148, 152 Holmberg's generalization, 501, 502 HPSG, *see* Head-Driven Phrase Structure Grammar I-language, 114 idioms, 421, 422, 422<sup>1</sup> , 423, 424, 424<sup>2</sup> , 425–428, 428<sup>5</sup> , 429, 429<sup>6</sup> , 430–436, 436<sup>10</sup> , 436<sup>9</sup> , 437, 437<sup>10</sup> , 437<sup>11</sup> , 437<sup>12</sup> , 438–440, 440<sup>13</sup> , 440<sup>14</sup> , 441, 442, 445, 446, 451 implicational relations, 36, 44, 47, 51, 243 implicit arguments, 102, 439, 440, 440<sup>13</sup> impoverishment, 184, 184<sup>11</sup> , 202, 203, 208, 208<sup>40</sup> , 211–213

incorporation, 146, 175, 202, 205, 206, 208, 211, 213, 357, 358, 361, 366, 372, 377 inner aspect, 329–331, 335, 336, 339, 344–346, 351 islands, 363, 504<sup>7</sup> labelling, 15, 16, 358, 364, 367, 371<sup>23</sup> , 372 language acquisition, 15, 16, 21, 26–28, 50<sup>16</sup> , 61, 62, 65, 71, 73, 83, 83<sup>21</sup> , 113, 135, 280, 286, 374, 405, 415, 417 language change, 3, 4, 8, 10, 11, 13, 91, 114, 137, 278 LCA, *see* linear correspondence axiom left dislocation, 257, 257<sup>7</sup> , 264–266, 273, 274 lexical categories, 359, 360, 365, 368, 374 Lexical Functional Grammar, 133–135, 137<sup>1</sup> , 138–140, 143, 146, 149, 149<sup>7</sup> , 150, 152, 153 LFG, *see* Lexical Functional Grammar linear correspondence axiom, 279, 280, 282, 288 Merge, 15, 16, 18, 20, 21, 67, 71, 98, 152, 279, 357, 358, 362–365, 369<sup>22</sup> , 372, 376, 377, 497, 498, 498<sup>2</sup> , 499–503, 503<sup>6</sup> , 504–506 microvariation, 28, 37, 49 mirror principle, 32 movement, 29, 30<sup>3</sup> , 65, 80, 92, 93, 98, 135, 222, 223, 226, 227, 229, 388<sup>2</sup> , 467, 468, 473, 478, 491, 499–502, 504<sup>8</sup>

multidominance, 362, 364, 366, 366<sup>17</sup> , 372 nanosyntax, 142, 144 nominative case, 28<sup>1</sup> , 74, 115<sup>2</sup> , 178, 178<sup>4</sup> , 184 Northern subject rule, 176, 177, 179, 179<sup>5</sup> , 180, 181, 181<sup>8</sup> , 182–185, 185<sup>12</sup> , 186, 186<sup>13</sup> , 187–190, 190<sup>19</sup> , 191–193, 197–199, 199<sup>31</sup> , 200, 202, 204, 204<sup>35</sup> , 205, 205<sup>36</sup> , 206–209, 209<sup>42</sup> , 210, 210<sup>43</sup> , 210<sup>44</sup> , 212–214 null pronouns, 221, 224–227, 229 null subjects, 79, 506 numerals, 258, 271, 484, 511, 512, 512<sup>2</sup> , 513, 514, 517–519, 521<sup>6</sup> , 523 object marking, 34, 36, 40, 42, 44, 48–50 parameter hierarchies, 26, 29, 37, 40, 41, 45–47, 51, 235, 243, 350, 351 parameters, 25–29, 37, 38, 40–43, 48, 50, 50<sup>16</sup> , 51, 62–64, 64<sup>3</sup> , 76<sup>14</sup> , 82, 113, 114, 131, 159, 169, 171, 226, 245, 329, 330, 339<sup>4</sup> , 350 participles, 317, 318, 320, 321, 323, 324, 386, 391, 400, 401, 403, 405, 410 passive, 32, 33<sup>6</sup> , 40, 317, 318, 320, 320<sup>3</sup> , 324, 327, 392–394, 421–428, 428<sup>5</sup> , 429, 429<sup>6</sup> , 430, 431, 432<sup>7</sup> , 435–437, 437<sup>12</sup> , 438–440, 440<sup>13</sup> , 440<sup>14</sup> , 441, 446, 450, 465, 475<sup>4</sup> Phase impenetrability condition, 19

phases, 18, 19, 27, 33, 36, 37, 44, 45, 48, 226, 227, 364, 370, 372<sup>26</sup> , 376, 377, 497, 498<sup>1</sup> , 499, 499<sup>3</sup> , 500–503, 503<sup>6</sup> , 504, 505, 505<sup>9</sup> , 506 phonological phrases, 255<sup>2</sup> , 255<sup>3</sup> , 256<sup>5</sup> , 258, 259, 262, 264, 273 PIC, *see* Phase impenetrability condition PLD, *see* primary linguistic data possessors, 485 predication, 296, 297, 310, 461, 462, 463<sup>1</sup> , 464, 465, 468, 471, 473, 474, 476, 480, 482, 486–489, 492, 493 preposition stranding, 221, 223–225, 227–229 primary linguistic data, 27, 29 pro-verbs, 293, 295, 296, 300, 300<sup>9</sup> , 301, 305, 309 prosodic domains, 207, 254, 255, 258, 262, 264, 267, 271, 273, 274 pseudo-clefts, *see also* clefts, 114, 117, 119, 122, 125, 126 raising, 65, 116, 149 reanalysis, 91, 93, 95, 98, 99, 103–105, 108, 113, 114, 116, 125, 129, 130, 160, 169, 169<sup>10</sup> , 170, 194<sup>26</sup> , 211–213 recursion, 15, 16, 16<sup>1</sup> , 16<sup>2</sup> , 17, 18, 20–22 relational grammar, 392 relative clauses, 17, 19, 20, 45, 46, 65, 70, 72, 81–83, 97, 99<sup>5</sup> , 101, 102, 121, 122, 221, 222, 470, 471, 484, 488, 490, 491, 493 resumptive pronouns, 491 right dislocation, 256, 257, 273, 274

roll-up movement, 29, 30<sup>3</sup> self-embedding, 20, 21 SMT, *see* strong Minimalist thesis split intransitivity, 392, 397, 398, 403, 404, 407, 410, 411, 415, 418 strong Minimalist thesis, 364, 365, 497, 498, 500, 506 syntactic categories, 91, 93–95, 99, 101, 103, 105, 108, 140, 144, 148, 149, 360 syntax–phonology interface, 254, 273 telicity parameter, 339, 344 thematic roles, 297, 298, 298<sup>7</sup> , 299, 300, 307, 309, 388<sup>2</sup> tone, 253, 254, 259–262, 264, 267, 268, 270, 271, 273 topic, 74, 78, 79, 115<sup>2</sup> , 120, 341, 342, 342<sup>10</sup> topicalization, 227, 500 unaccusative hypothesis, 385, 387, 392–395, 403, 405, 408, 413, 418 unaccusativity, 392, 393, 395–400, 400<sup>4</sup> , 405, 408, 410, 413, 422, 423, 425<sup>3</sup> , 437, 439, 440 unergativity, 395, 397–400, 405, 407, 408, 413, 414 Universal Grammar, 362, 364, 407, 506 verb movement, 29, 159, 160, 160<sup>1</sup> , 161, 161<sup>1</sup> , 164, 168, 169, 169<sup>8</sup> , 170, 209, 357, 358, 376, 377, *see also* head movement

verb-initial constituent order, 76–81, 114, 116, 117 verb-internal modifiers, 358, 359, 359<sup>4</sup> , 360–362, 364–367, 370, 373, 377 wh-words, 295, 296<sup>3</sup> , 298, 310 verbal wh-words, 293–296, 299, 300

φ-features, 29, 39, 40, 40<sup>10</sup> , 41, 42, 44–48, 50, 51, 182, 202, 207, 211, 467, 469, 470

# Syntactic architecture and its consequences I

This volume collects novel contributions to comparative generative linguistics that "rethink" existing approaches to an extensive range of phenomena, domains, and architectural questions in linguistic theory. At the heart of the contributions is the tension between descriptive and explanatory adequacy which has long animated generative linguistics and which continues to grow thanks to the increasing amount and diversity of data available to us.

The chapters address research questions on the relation of syntax to other aspects of grammar and linguistics more generally, including studies on language acquisition, variation and change, and syntactic interfaces. Many of these contributions show the influence of research by Ian Roberts and collaborators and give the reader a sense of the lively nature of current discussion of topics in synchronic and diachronic comparative syntax ranging from the core verbal domain to higher, propositional domains.

This book is complemented by two other volumes.